ABSTRACT

This document is the first of three volumes which present
techniques and methods for developing efficient Monte Carlo
simulations. Each volume presents techniques for reducing
computational effort in one of the following areas:

Vol. I - Selecting Probability Distributions, Vol. II - Random
Number Generation for Selected Probability Distributions,
and Vol. III - Variance Reduction.

This volume provides a straightforward approach and associated
techniques for selecting the most appropriate probability
distributions for use in Monte Carlo simulations. Part I, BASIC
CONSIDERATIONS, presents the underlying concepts and principles
for selecting probability distributions. Part II, SELECTION OF
DISTRIBUTIONS, gives the mathematical models representing
stochastic processes and presents step-by-step procedures for
identification and selection of the appropriate probability
distributions based upon the degree of knowledge and available
data for the random variable under study.
EXECUTIVE SUMMARY


Monte Carlo simulation is one of the most powerful and commonly
used techniques for analyzing complex physical problems. Applications
can be found in many diverse areas from radiation transport to river basin
modeling. Important Navy applications include analysis of antisubmarine
warfare exercises and operations, prediction of aircraft or sensor
performance, tactical analysis, and matrix game solutions where random
processes are considered to be of particular importance. The range of
applications has been broadening and the size, complexity, and
computational effort required have been increasing. However, such
developments are expected and desirable since increased realism is
concomitant with more complex and extensive problem descriptions.

In recognition of such trends, the requirements for improved
simulation techniques are becoming more pressing. Unfortunately, methods
for achieving greater efficiency are frequently overlooked in developing
simulations. This can generally be attributed to one or more of the
following reasons:

• Analysts usually seek advanced computer systems to
perform more complex simulation studies by exploiting
increased speed and/or storage capabilities. This
is often achieved at a considerably increased expense.

• Many efficient simulation methods have evolved for
specialized applications. For example, some of the
most impressive Monte Carlo techniques have been
developed in radiation transport, a discipline that does
not overlap into areas where even a small number of
simulation analysts are working.

• Known techniques are not developed to the point where
they can be easily understood or applied by even a
small fraction of the analysts who are performing
simulation studies or developing simulation models.
In addition to the above reasons, comprehensive references describing
efficient methodologies to improve Monte Carlo simulation are not
available. It is the intent of these volumes to help alleviate the above
shortcomings in Monte Carlo simulation.

This  document  is  the  first  of  three  volumes  which  present  techniques 
and  methods  for  developing  efficient  Monte  Carlo  simulations.  Each  volume 
is  essentially  a  self-contained  discussion  of  useful  techniques  which  can  be 
applied  in  reducing  computational  effort  in  one  of  the  following  three  major 
aspects  of  Monte  Carlo  simulation: 

• Selecting Probability Distributions - Volume I

• Random Number Generation for Selected Probability
Distributions - Volume II

• Variance Reduction - Volume III

The purpose of these volumes is to provide guidance in developing
Monte Carlo simulations that accurately reflect the behavior of various
characteristics of the system being simulated and are most efficient in
terms of computational effort. The basic intent is to provide understanding
of the concepts and methods for reducing analysis and computational effort
as well as to serve as a practical guide for their application. The volumes
have been prepared primarily for the systems analyst and computer programmer
who have a basic background and experience in simulation and elementary
statistics. Thus, the material is presented so as not to require extensive
knowledge of statistical techniques or an extensive literature search.
However, it is assumed the reader has a grasp of the fundamentals of Monte
Carlo methods, simulation modeling, and elementary statistics.




1.  INTRODUCTION 


The starting point in developing any Monte Carlo simulation is the
construction of mathematical models which describe the stochastic
behavior of the variables in the process under study. When the underlying
processes are well understood and the functional forms of the variables
are known, development of a model is straightforward. However, in many
applications the exact functional form of the variable is not known, thus
requiring selection from among a myriad of possible distributions to find
the one that will best represent the process. This volume provides a
straightforward approach and associated techniques for selecting the most
appropriate probability distributions for use in Monte Carlo simulations.

Part I of this volume, BASIC CONSIDERATIONS, presents the underlying
concepts and principles to be used in the selection of probability
distributions. This background information provides the reader with an
understanding of the important considerations, tasks, methods, and procedures
involved in dealing with simulation events characterized by random variables.

Following Part I, the reader will find in Part II, SELECTION OF
DISTRIBUTIONS, the mathematical models which will represent the stochastic
behavior of the process as accurately as the data and understanding of the
processes will allow. Part II presents step-by-step procedures for the
identification and selection of appropriate probability distributions. Part II
applies the rationale developed in Part I to the problems of developing
distributions based on varying amounts of data and depth of understanding of
the processes being simulated.

This volume also includes additional information useful in the
selection of probability distributions. Appendix A contains background
information on the complex parametric families of distributions which will
be useful for the reader who has not encountered these distributions before.
Appendix B contains tables which are needed in making computations involving
distribution fitting and testing. Appendix C is an abstracted bibliography
of publications relating to the subjects of probability distribution
identification and selection.
PART I


BASIC  CONSIDERATIONS 


2.  FUNDAMENTALS  OF  DISTRIBUTION  SELECTION 


Selection  of  an  appropriate  probability  distribution  for  a  given 
random  variable  in  a  simulation  requires  gathering  and  evaluating  all 
the  available  facts,  data,  and  knowledge  concerning  each  variable.  It 
is  also  important  to  know  how  the  particular  process  which  any  given 
variable  represents  relates  to  the  entire  simulation  model.  For  Monte 
Carlo  applications  this  includes  careful  investigation  of: 

•  Each  individual  process  or  event 

•  Underlying  theory  of  the  process

•  Data  representing  the  variability  of  the  process 

•  Sensitivity  of  the  process  being  simulated  to  probable 
values  of  the  variable 

•  Simulation  programming  considerations 

When the variable under consideration is just one among many
variables which affect the overall problem or system, the simulation is often
not very sensitive to the choice of the distribution. This can be likened
to the phenomenon of summing a series of random variables, none of which
dominates the sum. In this case the total tends to have a normal
distribution irrespective of the individual distributions (see Refs. 7, 27).
In other cases, the selection of a distribution is more critical to effective
simulation. For example, when only a few variables dominate the process or
the process is greatly influenced by rare occurrences (e.g., failure of a
critical high-reliability component), the selection of probability
distributions becomes of paramount importance.

Choosing the form of probability distributions is often a trade-off
between theoretical justification and empirical evidence. Typically,
some form of parametric distribution can be justified, such as the
normal, uniform, binomial, or Bernoulli distribution. Available data
can then be used to estimate its parameters. In the absence of empirical
data, one is forced to choose distributions on either theoretical or
intuitive grounds, or often to use several distributions and conduct
sensitivity or worst-case analyses. At the other extreme, where empirical
data are abundant, either the histogram can be used or more elaborate
parametric models can be employed.

The final choice of a particular distribution type is, of course,
also dependent on ease of implementation. Computer storage space,
computation time, and ease of programming are key considerations in
most simulations. Generating random variables from a parametric
distribution requires taking an inverse of the cumulative distribution
function or using other random number generation techniques (see
Volume II). For some distributions, such as the exponential or uniform,
the inverse operation is a simple computation. For others, such as
the normal, relatively simple techniques are available. Histograms
are also fairly easy to use in computer simulations. Here, only a list
of numbers must be stored (the more variable and detailed the histogram,
of course, the longer the list). For many distributions, however,
inverse algorithms for generation of random numbers do not exist, and
other methods require lengthy computation. In this case, a compromise
must be made between ease of computation and simulation accuracy.
Making an estimate of how sensitive the total simulation will be
to individual probability distribution assumptions is important in
determining this compromise.
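
As an illustration of why the inverse operation is simple for the uniform
and exponential distributions, the following is a minimal Python sketch.
The function names and the seed are assumptions made for this example and
are not taken from Volume II. Each function maps a uniform random number u
in [0, 1) through the closed-form inverse of the cumulative distribution
function.

```python
import math
import random

def sample_uniform(a, b, u):
    """Inverse CDF of the uniform distribution on [a, b]."""
    return a + (b - a) * u

def sample_exponential(rate, u):
    """Inverse CDF of the exponential distribution, F(x) = 1 - exp(-rate * x)."""
    # u lies in [0, 1), so 1 - u lies in (0, 1] and the logarithm is defined.
    return -math.log(1.0 - u) / rate

rng = random.Random(12345)  # fixed seed so the sketch is repeatable
draws = [sample_exponential(2.0, rng.random()) for _ in range(20000)]
mean_draw = sum(draws) / len(draws)  # should lie close to 1 / rate = 0.5
```

No comparable closed-form inverse exists for the normal distribution, which
is why other generation techniques are used there.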

2.1 BASIS FOR MAKING SELECTIONS

Before proceeding to the techniques of distribution selection
and their application in simulation development, it is necessary to
understand the underlying concepts for making selections. Basically, the
selection process described in Part II depends on two factors: the
extent of knowledge of the process under study (qualitative) and the
amount of data available (quantitative). Knowledge of the process refers
to the level of understanding of its behavior and characteristics. For
example, it is possible in some cases to be quite certain that the
frequency distribution of a random variable is normal based on familiarity
with the process. At the other extreme, little or nothing may be known.
Similarly, the amount of data describing a particular variable may range
from extensive to none. Each combination of the state of knowledge and
amount of data poses particular problems in selecting the most
appropriate distribution.

2.2 QUALITATIVE BASIS FOR SELECTION

Developing  an  understanding  of  some  random  process  involves 
analysis  to  characterize  the  process.  In  general,  such  efforts  attempt 
to  identify  the  process  on  the  basis  of: 

•  Similarity  to  some  other  process  whose  behavior  is  known 

•  Underlying  theory 

•  Certain  qualitative  aspects. 

Often a process can be likened to some other, the behavior of
which is known. In such circumstances, it can be reasonably argued
that this known distribution might apply to the one under study.

For  example,  consider  the  simulation  of  a  process  involving  the 
human  performance  of  some  manual  task.  Even  though  the  task  may 
bear  no  particular  resemblance  to  one  in  which  the  distribution  is 
known,  an  assumption  of  similarity  is  reasonable.  The  frequency 
distribution  of  time  of  performance  is  likely  to  be  from  the  same 
family  of  distributions  even  though  the  actual  process  might  be  quite 
different. 
Many activities for which stochastic models must be developed
can, at least generally, be identified by some applicable theory.
Consider the case in which some repetitive human activity is involved such
as in maintenance. Maintainability theory would indicate a strong
likelihood that the frequency distribution of time to perform would have a
lognormal or a gamma distribution. Similarly, if the failure of
electronic parts were to be modeled, it could be assumed that an
exponential or possibly a Weibull might be applicable (53). Such reasoning is
a fundamental part of the task of distribution selection.

There are, of course, many situations in which a theoretical
basis for a particular distribution can be established. Consider the
shots fired at a target or the velocity of a molecule in a stable solution.
Under fairly weak conditions the magnitude of the velocity of the molecule
or the deviation of shots (in three-dimensional space) from the bull's eye
can be shown to have a Maxwell distribution (27). The component of velocity
in any direction or the projection of shots onto any axis through the
bull's eye follows the normal distribution. In two dimensions the
resulting distribution is the Rayleigh. If the process being modeled
involves reliability, the exponential distribution reflects the behavior of
an item with a constant failure rate. If the process involves waiting
or queueing phenomena, the exponential can be used to depict random
arrival and service times. The gamma distribution also has wide
application since it is related to the exponential distribution. The
time up to a given number of occurrences has a gamma distribution
if the time between occurrences follows an exponential distribution.
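
The relation between the exponential and gamma distributions can be checked
numerically. The sketch below is an illustration only; the function name,
the rate, and the seed are assumptions. It sums k exponential times between
occurrences to obtain the waiting time to the k-th occurrence, then compares
the sample mean and variance with the gamma values k/rate and k/rate^2.

```python
import random

def time_to_kth_event(k, rate, rng):
    """Sum of k independent exponential times between occurrences."""
    return sum(rng.expovariate(rate) for _ in range(k))

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
k, rate = 3, 2.0
waits = [time_to_kth_event(k, rate, rng) for _ in range(20000)]

mean_wait = sum(waits) / len(waits)
var_wait = sum((w - mean_wait) ** 2 for w in waits) / (len(waits) - 1)
# Gamma with shape k and rate 2.0: mean k / rate = 1.5, variance k / rate**2 = 0.75
```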

In some cases, it will not be possible to relate the process
being examined to anything which is known. This may be either because
little understanding of the process exists or it simply bears no relation
to any process whose behavior can be described on a theoretical basis.
However, there still may be some clues which are useful in identifying
an applicable distribution, particularly where some data exist. A
number of qualitative aspects of the process can be helpful. These include,
for example, consideration of whether the variable is discrete or
continuous, bounded, symmetric, or can be described in some other
similar ways. Such clues, although probably not sufficient for positive
identification alone, are useful in making a rational selection of a
distribution.

2.3 QUANTITATIVE BASIS FOR SELECTION

One of the most common problems in simulation is not having,
or not being able to obtain, the data necessary to describe a particular
variable. Collecting the data may be too time consuming or expensive. In
some cases it is simply not possible. Consequently, the amount of data
available is one of the major considerations in the selection of
probability distributions.

Where  sufficient  data  are  available,  an  empirical  approach 
can  be  used.  This  means  essentially  using  the  data  to  derive  a 
model.  Combined  with  the  state  of  knowledge  of  the  process  being 
modeled,  graphical  and  analytical  techniques  can  be  employed  to 
select  the  distribution  most  representative  of  the  data. 

In those cases where acquisition of the data is difficult, the
application of the methodology of Part II can be useful in determining
whether such effort is warranted. If a distribution can, in fact,
be selected with little data, there may be no justification for
collecting more. If, on the other hand, a distribution cannot be identified
and the simulation results are sensitive to that particular variable,
additional data may be essential for developing a valid model.
3. TECHNIQUES USED IN DISTRIBUTION SELECTION


Specific techniques for selecting a particular stochastic model
depend on the information and data available. The situation can range
from having practically nothing to work with to almost certain
specification of the model based on sound theoretical and empirical evidence.

The  development  of  the  theoretical  evidence  is  entirely  qualitative. 
Development  of  the  empirical  evidence  requires  the  use  of  a  number  of 
quantitative  methods.  These  include: 

•  Sensitivity  analysis 

•  Graphical  analysis 

•  Parameter  estimation 

•  Goodness-of-fit  testing.

Each  of  these  is  introduced  briefly  in  the  following  sections. 

3.1 SENSITIVITY ANALYSIS

The purpose of sensitivity analysis is to determine the extent
to which the outcome of an analysis is dependent upon a particular
variable or assumption. It is particularly applicable in simulation
where little or no data are available to characterize some random
variables. In such a situation, sensitivity analysis can indicate whether
or not the behavior of the variable must be more accurately known.
If, for instance, the outcome of the simulation is not sensitive to the
variable, no further effort to characterize it is necessary. However,
if it does prove sensitive, an attempt to develop an accurate
distribution model is warranted.

The only practical way to perform the sensitivity analysis is
to perform a simulation varying the values or assumptions concerning
the variable in question. Comparison of the results using standard
statistical tests can reveal whether significant differences are
produced (see Sections 3.4 and 9). This is not so formidable a task as
it might at first appear. If the simulation is to have any real validity
in the first place, the behavior of most of the variables must be known.
If only a few variables can be accurately described, a simulation
merely produces a precise but inaccurate result.
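
The procedure can be sketched as follows. The model here is hypothetical and
serves only to show the mechanics: the output is the sum of five well-known
variables plus one poorly known variable, and the simulation is run twice
with two candidate distributions of equal mean for the poorly known
variable. The two sets of results are compared with a two-sample t statistic
for unequal variances (Welch's form), which is one standard choice among
several and is an assumption of this sketch.

```python
import math
import random
import statistics

def run_simulation(draw_uncertain, n_runs, rng):
    """Hypothetical model: five well-known variables plus one uncertain variable."""
    results = []
    for _ in range(n_runs):
        known = sum(rng.gauss(10.0, 1.0) for _ in range(5))
        results.append(known + draw_uncertain(rng))
    return results

def welch_t(a, b):
    """Two-sample t statistic that does not assume equal variances."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return (statistics.mean(a) - statistics.mean(b)) / math.sqrt(va / len(a) + vb / len(b))

rng = random.Random(7)  # fixed seed so the sketch is repeatable
out_exp = run_simulation(lambda r: r.expovariate(1.0), 2000, rng)   # candidate 1, mean 1
out_uni = run_simulation(lambda r: r.uniform(0.0, 2.0), 2000, rng)  # candidate 2, mean 1
t_stat = welch_t(out_exp, out_uni)
```

A value of t_stat that is small in magnitude suggests that the mean output
of this toy model is not sensitive to the choice between the two candidates.
Other output measures, such as tail probabilities, may behave differently
and would need their own comparison.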

3.2 GRAPHICAL ANALYSIS

One of the topics in elementary applied statistics is the
construction of frequency histograms and cumulative frequency polygons.
These procedures provide one means for identifying appropriate
distribution models under the proper circumstances. Where such
techniques are applicable they do offer the advantage of relative simplicity.
They are most useful when there is some knowledge of the process and
at least minimal data available.

The histogram is constructed from data concerning the
variable. It carries with it all the present empirical information available
on the variable, nothing more. It does not try to estimate probable
behavior. If rare events have not been observed, for instance, it will
assign zero probability to their occurrence. Since it uses all data, it
also perpetuates the mistakes of erroneous observations and may
describe a model that is not valid.

The most common graphical procedure is the construction of
the frequency histogram. This is simply a plot of the frequency with
which each of various values occurs in the sample data. The
histogram is useful in two ways. It provides visual evidence of the shape
of the distribution which can be useful in selecting a distribution. It may
also be used directly in the simulation as the model of the process.
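
Direct use of a histogram as the model can be sketched as follows. The data
values, interval edges, and function names are invented for illustration.
Observations are counted by interval; a random value is then generated by
choosing an interval in proportion to its count and drawing uniformly within
it. Drawing uniformly within the interval is an assumption of this sketch,
and other interpolation rules are possible.

```python
import bisect
import random

def build_histogram(data, edges):
    """Count observations falling in each interval [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        i = bisect.bisect_right(edges, x) - 1
        if 0 <= i < len(counts):  # values outside the edges are ignored
            counts[i] += 1
    return counts

def sample_from_histogram(counts, edges, rng):
    """Pick an interval in proportion to its count, then a uniform point in it."""
    i = rng.choices(range(len(counts)), weights=counts, k=1)[0]
    return rng.uniform(edges[i], edges[i + 1])

data = [0.2, 0.7, 1.1, 1.3, 1.4, 2.6, 2.9, 3.5]  # illustrative observations
edges = [0.0, 1.0, 2.0, 3.0, 4.0]
counts = build_histogram(data, edges)

rng = random.Random(1)  # fixed seed so the sketch is repeatable
draws = [sample_from_histogram(counts, edges, rng) for _ in range(1000)]
```

Only the edges and counts need to be stored. An interval with a count of
zero is never selected, which is the zero-probability behavior for
unobserved events noted in the text.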
When data are abundant the use of the histogram is often adequate
for many Monte Carlo applications. In using the histogram, care must
always be exercised to remove obvious errors and to consider low
probability events. When only limited data are available the histogram
approach suffers from sampling peculiarities and from lack of
observations in the tails of the distribution. In this case more effective
distributions can be developed by taking into consideration other
information about the behavior of the variable or by obtaining additional
information from the data, e.g., by estimating higher moments. This
information can range from an understanding of the theoretical nature
of the variable to intuition. It might be assumed, for example, that
the underlying real distribution is continuous; then smoothing
procedures can be applied to the histogram to obtain a continuous curve.

Another graphical procedure useful in the selection of
probability distributions involves the use of probability paper. As with the
histogram, there is a large element of subjectivity in this procedure.
It involves selection of an appropriate probability paper from those
available and plotting the sample distribution function. Judgment is required
in deciding whether the plot sufficiently approximates a straight line.
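
A numerical counterpart to normal probability paper can be sketched as
follows. The ordered observations are paired with standard normal quantiles,
and the correlation coefficient of the resulting points is used as a rough
indicator of straightness. The plotting position (i - 0.5)/n and the use of
a correlation coefficient are assumptions of this sketch and are not
prescribed by this volume. Judgment is still required in interpreting the
result.

```python
import math
import random
import statistics

def normal_plot_points(sample):
    """Pair each ordered observation with a standard normal quantile."""
    n = len(sample)
    ordered = sorted(sample)
    std_normal = statistics.NormalDist()
    # Plotting position (i - 0.5) / n is one common convention, assumed here.
    quantiles = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return quantiles, ordered

def straightness(xs, ys):
    """Pearson correlation of the plotted points; values near 1 suggest a line."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(11)  # fixed seed so the sketch is repeatable
normal_sample = [rng.gauss(50.0, 5.0) for _ in range(200)]
skewed_sample = [rng.expovariate(1.0) for _ in range(200)]
r_normal = straightness(*normal_plot_points(normal_sample))
r_skewed = straightness(*normal_plot_points(skewed_sample))
```

The normally distributed sample should plot closer to a straight line
(correlation nearer to 1) than the skewed sample.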

The use of graphical procedures in simulation development
is described in Section 6, Part II.

3.3  PARAMETER  ESTIMATION 

A parametric distribution is defined to be a functional or
analytical representation for a probability distribution which depends
on one or more parameters. Although use of such distributions
requires that the parameter(s) be estimated, there are a number of
reasons for using a parametric distribution function rather than a
histogram in developing a mathematical model. In particular, a
parametric distribution:


•  Provides  a  convenient  means  for  inclusion  of  additional 
information  about  the  variable  (such  as  known  upper  and 
lower  limits  on  the  data). 

• Allows meaningful extrapolation into the tail(s) of the
distribution and into regions where no data are available.

• Allows incorporation of the additional information
inherent in the shape of the distribution if there is a theoretical
justification.

•  Provides  for  a  reproducible  means  of  representing  the 
data  since  freehand  "fit"  to  the  same  data  will  vary  from 
person  to  person. 

• Provides important summary information about the
variable in the form of estimated parameters of the fitted
distribution.

•  Provides  a  more  compact  representation  of  the  random 
variable  usually  resulting  in  less  data  storage  requirements. 

•  Allows  construction  of  reasonable  and  convenient  models 
in  cases  of  no  data  or  very  limited  data. 

• Provides for efficient and convenient random number
generation in most cases.

•  Facilitates  analytic  (rather  than  simulation)  studies  of 
portions  of  the  process. 

• Permits a convenient means whereby analysis of the
sensitivity to the shape of the distribution can be accomplished.
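
As one illustration of estimating parameters from data, the sketch below
fits a gamma distribution by the method of moments. This technique is chosen
here only for brevity and is not necessarily the estimation procedure
described in Part II. The function name, the true parameter values, and the
seed are assumptions of the example.

```python
import random
import statistics

def fit_gamma_by_moments(data):
    """Method-of-moments estimates: shape = mean^2 / variance, scale = variance / mean."""
    m = statistics.fmean(data)
    v = statistics.variance(data)
    return m * m / v, v / m

rng = random.Random(99)  # fixed seed so the sketch is repeatable
# Synthetic data from a gamma distribution with shape 2 and scale 3.
data = [rng.gammavariate(2.0, 3.0) for _ in range(5000)]
shape_hat, scale_hat = fit_gamma_by_moments(data)
```

The two fitted numbers replace the 5000 stored observations, which
illustrates the compactness and summary value of a parametric
representation.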


To facilitate the presentation of parametric distributions, the
individual parametric families have been classified as being either of
a simple or of a complex nature. The difference between these two
classifications is mainly the number of parameters necessary to
describe the distribution. The simple distributions are characterized
by no more than two parameters, the complex by more than two.

The other distinguishing feature is that simple distributions
are those which are commonly encountered, relatively easy to
recognize, and have some theoretical basis for their functional form and
application. Thus, simple parametric families of distributions can
often be derived from assumptions about the process generating the
random variable or from graphical evidence based on the data.

The complex parametric families generally do not have a
"nice" physical interpretation or a simple functional form. They
can be viewed more as abstract inventions which admit enough shapes
to ensure a reasonable fit to any set of observations. They also
provide greater flexibility than simple distributions in projecting events
of the process that would appear in the tails of the distribution.

3.3.1  Simple  Parametric  Distributions 

The simple distributions include, but are not limited to, the
normal, gamma, binomial, exponential, and other distributions which
can be defined by at most two parameters. For the purposes of
selecting an appropriate probability model, a simple distribution will be
indicated by the underlying theory of the process or by preliminary
selection using graphical procedures referred to previously.

One of the most common and useful of the simple continuous
probability functions is the normal distribution. Much of the appeal
of this distribution is based on the central limit theorem. In essence,
this states that the sum of independent variables tends to be normally
distributed (27). This assumes, of course, that none of the individual
elements of the sum dominates its behavior. Since many variables which
are modeled in Monte Carlo simulations are in reality derived from
several variables, the assumption of a normal distribution can often be
justified.
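
The tendency toward normality can be illustrated numerically. The sketch
below sums twelve independent uniform variables, none of which dominates,
and compares the result with what a normal distribution of the same mean and
standard deviation would give. The number of terms and the seed are
arbitrary choices made for this example.

```python
import random
import statistics

def sum_of_uniforms(n_terms, rng):
    """Sum of n_terms independent uniform(0, 1) variables."""
    return sum(rng.random() for _ in range(n_terms))

rng = random.Random(5)  # fixed seed so the sketch is repeatable
n_terms = 12
sums = [sum_of_uniforms(n_terms, rng) for _ in range(20000)]

mean_s = statistics.fmean(sums)  # theory: n_terms / 2 = 6
sd_s = statistics.stdev(sums)    # theory: sqrt(n_terms / 12) = 1
# Fraction within one standard deviation of the mean; a normal gives about 0.683.
within_1sd = sum(abs(s - 6.0) <= 1.0 for s in sums) / len(sums)
```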

Since simple parametric distributions are discussed in detail in
most elementary textbooks on probability, they are not discussed in
detail here. However, a summary of the more common simple
parametric distributions is given in Section 4.3.

3.3.2  Complex  Parametric  Distributions 

As used in this volume, complex parametric distributions are
defined as the Weibull, Johnson, and Pearson distribution families.
The functional form of these distributions is somewhat complicated,
and three to five parameters are often required to define the specific
distribution. Resorting to the analytic procedures needed to generate
these distributions is most necessary when a simple distribution cannot
be justified and the simulation results are dependent upon rare events.

Rare events are usually related to the tails of the distribution. For
certain events or processes to be simulated, sufficient observations
to accurately define the tail regions may not exist. In such cases,
one usually employs smoothing techniques utilizing parametric
functions to extend or infer the behavior of the tail regions from available
data.

Using a complex parametric distribution can be viewed as a
convenient way of smoothing the raw data and expressing the smoothed
data in functional form. These three families admit almost every type
of probability distribution, one major exception being composite
distributions made up of several distinct populations, e.g., multimodal
distributions. In fact most of the simple parametric distributions are
special cases of a Weibull, Johnson, or Pearson distribution.
If the reader is interested in a further discussion of these
distributions, background information is contained in Appendix A. The
material there is not, however, essential for understanding the
principles discussed in Part I or the methods described in Part II.

3.4 GOODNESS-OF-FIT TESTS

After initial selection of a distribution for a Monte Carlo
application and where sample data are available, it is usually
worthwhile to try to validate or substantiate the choice. The validation
step of the selection procedure is especially critical when it has been
determined that the Monte Carlo result will be sensitive to distribution
selection. More generally, developing confidence in the distributions
used in any simulation adds to the confidence in the total simulation in
addition to aiding in the overall understanding of the process.

One of the most useful validation methods is the goodness-of-fit
test. These are statistical procedures for testing whether sample data
can reasonably be expected to be representative of (drawn from) a
particular probability distribution. Essentially, there are two such
tests which have found wide application since they can be applied to
any distribution. These are the Chi-Square test and the
Kolmogorov-Smirnov test. A brief description of each of these two
tests is presented below. In addition, there are a number of useful
specialized tests, such as the W-test for a normal distribution and the
WE-test for an exponential distribution. Specific details for applying
these tests are contained in Part II, Section 9.

One word of caution should be noted in using these tests. The
statistical inferences based on these tests rely on asymptotic
properties. Thus a fair amount of data is required to obtain valid
interpretations. Where limited data are available or many erroneous data
points are believed to be in the sample, the usefulness of these tests
may be questionable.


Chi-Square Test: This common goodness-of-fit test is made by
subdividing the data into groups or intervals and comparing the number
of actual observations in the i-th interval to the number expected
as computed from the assumed distribution. The statistic employed in
this method is

$$\chi^2_{n-1} = \sum_{i=1}^{n} \frac{(A_i - E_i)^2}{E_i}$$

where $A_i$ is the actual and $E_i$ the expected number of observations
in the i-th of the n intervals.

Under  the  null  hypothesis  (observations  are  from  the  assumed  distribution) 
the  distribution  of  this  statistic  asymptotically  approaches  a  Chi-Square 
distribution  with  n-1  degrees  of  freedom. 
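
As a minimal sketch of the computation described above, the statistic
can be evaluated in a few lines of Python. The function name
`chi_square_statistic` and the die-throw counts are hypothetical
illustrations, not data from this volume; the critical value quoted in
the comment is the standard tabulated 5 percent point for 5 degrees of
freedom.

```python
def chi_square_statistic(observed, expected):
    """Sum of (A_i - E_i)^2 / E_i over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((a - e) ** 2 / e for a, e in zip(observed, expected))


# Hypothetical data: 120 throws of a die tested against a uniform model.
observed = [18, 22, 21, 17, 25, 17]
expected = [120 / 6] * 6

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1  # n - 1, as in the text

# The tabulated 5 percent critical value for 5 degrees of freedom is 11.07;
# a statistic below it gives no grounds to reject the assumed distribution.
reject = statistic > 11.07
print(statistic, degrees_of_freedom, reject)
```

Note that the expected counts here are all 20, which also satisfies the
cell-size recommendation discussed in the next paragraph.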

The Chi-Square test has certain obvious shortcomings. In addition
to being sensitive to sample size, this test is also sensitive to data
grouping. Different investigators conducting this test will tend to get
different results. One requirement in using the test is that each cell
or subgroup should have a sufficient number of observations in it.
Some authors (27) feel that a good test requires at least twenty
observations per cell and that there should also be between five and
twenty cells.

Kolmogorov-Smirnov Test: This goodness-of-fit test is made
by computing the maximum difference between the sample cumulative
distribution function and the assumed distribution function. This
difference, under the null hypothesis, has a known asymptotic
distribution which is available in table form (see Appendix B). The
Kolmogorov-Smirnov test is generally considered to be more sensitive
than the Chi-Square test and also has the advantage that arbitrary data
grouping decisions are not required. Its disadvantages are that it is
usually more computationally difficult to apply, and if the hypothesis
is rejected, the reason for the rejection is less clear.
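
The maximum difference can be computed directly from the ordered
sample, as in the following sketch. The function name `ks_statistic`,
the four-point sample, and the choice of a uniform (0, 1) model are
assumptions made for illustration; the resulting value would still have
to be compared with the tabulated critical values.

```python
def ks_statistic(sample, cdf):
    """Largest gap between the sample step function and the assumed CDF."""
    ordered = sorted(sample)
    n = len(ordered)
    largest = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # The sample CDF jumps from (i - 1)/n to i/n at x, so the gap is
        # checked on both sides of the jump.
        largest = max(largest, i / n - f, f - (i - 1) / n)
    return largest


def uniform_cdf(x):
    """CDF of the uniform distribution on (0, 1)."""
    return min(max(x, 0.0), 1.0)


sample = [0.10, 0.35, 0.40, 0.80]  # hypothetical observations
d = ks_statistic(sample, uniform_cdf)
print(d)
```

No grouping decision enters the calculation, which is the advantage over
the Chi-Square test noted above.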
PART II


SELECTION OF DISTRIBUTIONS

4. DISTRIBUTION SELECTION PROCEDURES


This  section  presents  a  systematic  set  of  procedures  for  selecting 
the  most  representative  model  for  a  random  variable  in  a  simulation. 

The procedures selected depend on two types of knowledge of the random
variable in question. These are:

1.  Empirical  Data  (Quantitative  Observations) 

2.  Understanding  of  the  Random  Process  (Qualitative  A  Priori 
Knowledge). 

Based on the degree of knowledge in each category, a set of procedures
for selecting a distribution has been constructed. By following a
particular procedure the most appropriate probability model can be
easily selected.

This section first discusses how to select the appropriate procedure
based on the degree of available knowledge of the random variable in
question. It then presents a brief guide to using the remaining
sections of Part II. The section concludes with a table listing all the
candidate distributions considered here and summarizing their
characteristics. The rest of Part II is concerned with how one performs
the specific operations which lead to selection of the appropriate
probability distribution model.

4.1 PROCEDURES FOR SELECTING DISTRIBUTIONS

The particular selection procedure for a probability model is
determined by the extent of empirical data and knowledge of the random
process in question. The extent of empirical data can, for convenience,
be broken into three categories: none, some, and ample. This
categorization is given in Table 4.1.
TABLE 4.1

Extent of Empirical Data (Observations)

Category    Description    Number of Observations
1           none
2           some           5-20
3           ample          over 20

The extent of knowledge of the random process is, for convenience,
broken into four categories: no knowledge, qualitative knowledge,
reasonably good ideas, and reasonable certainty. These categories
are described further in Table 4.2. It should be clear that the more
data and the greater the a priori qualitative knowledge available, the
easier the selection process is and the greater the certainty of
obtaining a good probability model.

TABLE 4.2

Extent of Qualitative Knowledge of the Random Process

Category                   Description
1  None:                   No qualitative knowledge of the random
                           process
2  Qualitative:            Some knowledge of the random process, i.e.,
                           continuity, range, symmetry, shape of
                           distribution, likely values, etc.
3  Good ideas:             Reasonably based expectations that the
                           random variable is one of a few known
                           families
4  Reasonable certainty:   Good basis for expecting the distribution
                           to be some known family
A concerted effort should be made to use all a priori knowledge.
This means that all the qualitative characteristics listed under
Category 2 in Table 4.2 should be written down, if known. This will also
help in sketching a probability density or frequency curve. Table 4.3
should also be consulted to determine if Categories 3 or 4 are
appropriate.

Table 4.3 lists all of the probability distributions considered here.
These are arranged in two groups, the simple parametric distributions
and the complex parametric distributions. This table also summarizes the
characteristics of these distributions. Table 4.3 is very useful as a
reference in selecting a probability distribution since almost all of
the information needed for selection is presented. To this end,
therefore, the columns in Table 4.3 entitled Comments and Justification
and Applications may give characteristics that fit the problem at hand.
Any distributions that appear appropriate should be listed so that
knowledge at a level of Category 3 or 4 can be used.

Once  the  categories  for  empirical  data  and  knowledge  of  the 
random  process  have  been  established  from  Tables  4.1  and  4.2,  a  specific 
selection  procedure  can  be  identified  from  Table  4.4.  Table  4.4  is 
simply  a  matrix  indicating  all  possible  combinations  of  data  and  knowledge 
categories.  For  each  combination,  a  figure  number  is  indicated.  Each 
figure  presents  the  details  of  the  particular  selection  procedure  that  it 
represents. 
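
The lookup described above is mechanical enough to sketch in code. The
function names are hypothetical, and because Table 4.1 lists no count
for the "none" category, treating fewer than five observations as "none"
is an assumption of this sketch.

```python
def data_category(n_observations):
    """Empirical data category of Table 4.1 (1 = none, 2 = some, 3 = ample)."""
    if n_observations > 20:
        return 3
    if n_observations >= 5:
        return 2
    return 1  # assumption: fewer than 5 observations counts as "none"


def selection_figure(n_observations, knowledge_category):
    """Figure holding the selection procedure, following the Table 4.4 matrix."""
    if knowledge_category not in (1, 2, 3, 4):
        raise ValueError("knowledge category must be 1, 2, 3, or 4")
    row = data_category(n_observations)
    # Each data category occupies one row of four figures.
    return "Figure 4.%d" % (4 * (row - 1) + knowledge_category)


# Twelve observations with "good ideas" about the process (Category 3).
print(selection_figure(12, 3))  # prints Figure 4.7
```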

A discussion of the selection procedures presented in Figures 4.1
through 4.12 and how that material is used is contained in the
following section (4.2).
TABLE 4.3 (Continued)
TABLE 4.4

Sequence of Activity Selection (By Figure Number)

Empirical Data     Knowledge of Random Process Category
Category           1             2             3             4
1                  Figure 4.1    Figure 4.2    Figure 4.3    Figure 4.4
2                  Figure 4.5    Figure 4.6    Figure 4.7    Figure 4.8
3                  Figure 4.9    Figure 4.10   Figure 4.11   Figure 4.12

4.2 SELECTION TECHNIQUES

The  following  list  provides  a  brief  description  of  each  selection 
technique  used  in  the  selection  procedures  and  provides  the  location  of 
further  detailed  discussion. 


Sensitivity Analysis (Section 5)
    Involves performing the simulation study using several different
    distributional assumptions or parameters to examine the effect
    on the final results.

Graphical Analysis (Section 6)
    Involves plotting a histogram and/or using probability paper to
    judge what distributions appear likely. This analysis may reject
    some ideas as inappropriate or suggest several likely
    distributions. This analysis applies primarily to the simple or
    common distributions.

Analytic Curve Fitting (Section 7)
    Refers to fitting the data to one or more of the complex or
    uncommon distributions such as the Weibull, Johnson, and Pearson.

Parameter Estimation (Section 8)
    Is the task of estimating the values of the parameters of a given
    distribution family to obtain the best fit with the data.

Goodness-of-Fit (Section 9)
    Tests are used to determine if the candidate distribution is an
    adequate representation of the actual random process based on the
    data available.

Histogram (Section 6)
    If all likely distributions fail the goodness-of-fit tests, a
    histogram should be used.

These techniques can best be applied by referring to the appropriate
section. After application of any technique, refer to the appropriate
figure to determine subsequent selection techniques to employ, if any.


Figure 4.1   No Data, No Knowledge
Figure 4.2   No Data, Qualitative Knowledge
Figure 4.3   No Data, Good Knowledge
Figure 4.4   No Data, Certain Knowledge
Figure 4.5   Some Data, No Knowledge
Figure 4.6   Some Data, Qualitative Knowledge
Figure 4.7   Some Data, Good Knowledge
Figure 4.8   Some Data, Certain Knowledge
Figure 4.9   Ample Data, No Knowledge
Figure 4.10  Ample Data, Qualitative Knowledge
Figure 4.11  Ample Data, Good Knowledge
Figure 4.12  Ample Data, Certain Knowledge

5.  SENSITIVITY  ANALYSIS 


The  objective  of  sensitivity  analysis  is  to  determine  the  extent  to 
which  the  final  results  of  the  simulation  study  are  sensitive  to  a  given 
probability  distribution.  To  this  end  two  general  guidelines  can  be  given. 

The first is to determine sensitivity to the parameters of a
distribution. It might be reasonable to vary the parameters to some
extent in both directions. Suppose, for example, that a normal
distribution with mean 100 and standard deviation 20 is postulated. Then
five runs might be made to test sensitivity of the final simulation
results to these parameters as follows [(mean, standard deviation)]:
(100, 20), (110, 20), (90, 20), (100, 18), (100, 22).
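
A sketch of how the five runs above might be organized is given below.
The `simulate` function is a hypothetical stand-in for a real simulation
(it merely estimates the probability that the variable exceeds 130), and
reusing one random seed for every run is an assumed design choice that
keeps the comparison from being blurred by sampling noise.

```python
import random


def simulate(mean, sd, trials=20000, seed=1):
    """Hypothetical model: fraction of normal draws that exceed 130."""
    rng = random.Random(seed)  # same seed for every run
    exceed = sum(1 for _ in range(trials) if rng.gauss(mean, sd) > 130.0)
    return exceed / trials


# (mean, standard deviation) pairs from the text: the base case first,
# then the mean moved both ways, then the standard deviation moved both ways.
cases = [(100, 20), (110, 20), (90, 20), (100, 18), (100, 22)]
results = {case: simulate(*case) for case in cases}

base = results[(100, 20)]
for case in cases:
    print(case, round(results[case], 4), round(results[case] - base, 4))
```

If the differences from the base case are small relative to the accuracy
required of the study, the result can be regarded as insensitive to
these parameters.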

A second sensitivity test that can be performed concerns the shape
of the parametric family: it may be reasonable to make several
simulations with different probability distributions, especially if
unlikely events are important to the simulation results. In this case
the shape of the tail of the distribution is important. Suppose, for
example, that a gamma distribution has been chosen; then a lognormal or
Weibull might also be tried, since these have similar shapes.
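
To illustrate why the tail matters, the following sketch compares the
probability of exceeding a high threshold under gamma, lognormal, and
Weibull models that share the same mean and standard deviation. The mean
of 10, standard deviation of 5, and threshold of 30 are assumed values
chosen so that the gamma shape parameter is an integer and its tail has
a closed form; none of these numbers come from this volume.

```python
import math
from statistics import NormalDist

MEAN, SD, THRESHOLD = 10.0, 5.0, 30.0  # hypothetical values
CV2 = (SD / MEAN) ** 2                 # squared coefficient of variation


def gamma_tail(t):
    """P(X > t) for the gamma model; closed form valid for integer shape."""
    shape = round(1.0 / CV2)           # equals 4 for the values above
    scale = MEAN / shape
    u = t / scale
    return math.exp(-u) * sum(u ** j / math.factorial(j) for j in range(shape))


def lognormal_tail(t):
    """P(X > t) for the lognormal model with matched mean and variance."""
    sigma = math.sqrt(math.log(1.0 + CV2))
    mu = math.log(MEAN) - 0.5 * sigma ** 2
    return 1.0 - NormalDist().cdf((math.log(t) - mu) / sigma)


def weibull_tail(t):
    """P(X > t) for the two-parameter Weibull with matched mean and variance."""
    def cv2_of(k):
        return math.gamma(1 + 2 / k) / math.gamma(1 + 1 / k) ** 2 - 1

    lo, hi = 0.2, 20.0
    for _ in range(100):               # bisection; cv2_of falls as k rises
        mid = 0.5 * (lo + hi)
        if cv2_of(mid) > CV2:
            lo = mid
        else:
            hi = mid
    shape = 0.5 * (lo + hi)
    scale = MEAN / math.gamma(1 + 1 / shape)
    return math.exp(-((t / scale) ** shape))


tails = {
    "gamma": gamma_tail(THRESHOLD),
    "lognormal": lognormal_tail(THRESHOLD),
    "weibull": weibull_tail(THRESHOLD),
}
for name, p in tails.items():
    print(name, "%.2e" % p)
```

Although the three densities look alike near the mean, their tail
probabilities at this threshold differ by roughly an order of magnitude,
which is the situation in which a shape sensitivity run is worthwhile.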




6.  GRAPHICAL  TECHNIQUES 


There are two graphical techniques that are applicable here.
The first deals with the empirical histogram and the second deals
with the empirical cumulative distribution polygon. Both techniques
can be quite useful in selecting a good functional fit to data.
These graphical techniques are intended primarily for use in selecting
one of the common or simple distributions. Although graphical
techniques can be helpful in the selection of a complex distribution,
this is discussed as analytical curve fitting in Section 7.

Graphical techniques can often suffice to determine a satisfactory
probability model for a simulation variable. This is especially
true if the simulation results are not sensitive to rare events of
the several random variables. An example is given in Section 6.3 to
illustrate the histogram and cumulative distribution polygon methods.

6.1 USING THE EMPIRICAL HISTOGRAM

The empirical histogram can be used to determine what distributions
are likely to fit a given set of data. This can best be
accomplished by a visual comparison to find curves representing
probability distributions that are similar to the data. The approach
taken in this section is to find such visual fits by examining a series
of figures representing the density function of most of the simple
distributions.

The procedure is very straightforward. First plot the histogram
from the data available. In some cases it may be helpful to
sketch a smoothed version of the histogram, especially if the cells
of the observation groupings are large or the data are few. Then
examine the shapes given in Figure 6.1 and select those distributions
whose densities are similar to the histogram. (Figure 6.1 does not
include the Weibull, Johnson, or Pearson distributions. For these
distributions, see Section 7.) It is also useful to rank the selections
according to how good the fit is.

6.2 USING THE EMPIRICAL CUMULATIVE DISTRIBUTION POLYGON

An alternate technique is to use the cumulative distribution
polygon in conjunction with probability paper. The horizontal axis of
this paper represents the values of the variable under investigation;
the vertical axis is a probability scale. The spacing on the vertical
axis is constructed for a given probability family so that a cumulative
distribution function belonging to that family will appear as a straight
line on the paper.

The graphical method is quite general and can be applied to
any known distribution; however, the probability paper which is
commercially available is limited to the more commonly encountered
distributions such as the normal (see Figure 6.2), lognormal, extreme
value, chi-square, gamma, binomial, and Weibull.*

The procedure for using this graphical method is extremely
simple, although interpretation of the results is somewhat subjective.
The sample cumulative distribution is plotted on the probability paper
corresponding to the theoretical distribution of interest. If the points
fall on a straight line, the theoretical distribution is accepted as
representative of the data. If the line is badly curved, other
distributions can be tried. The nature of the curve often suggests
distributions which might be of better fit.

*See, for example, TEAM Special Purpose Graph Papers, Box 25,
Tamworth, N.H. 03886; also K+E papers.

Fig. 6.1. Shapes of simple parametric distributions (3 sheets)

Fig. 6.2. Normal probability paper (horizontal axis: value of variable;
vertical axis: cumulative probability, percent of observations with
value less than value of variable indicated)

Another useful aspect of the graphical procedure is that estimates
of the distribution's parameters can be read directly off the
graph. For example, on normal probability paper, the difference
in variable value between the 0.50 probability point and the 0.84
probability point on the fitted line corresponds to one standard
deviation.

6.3  NUMERICAL  EXAMPLE 

An example will illustrate the use of these techniques. The
data for the example are given in Table 6.1. Observations ranging
from 66.75 to 75.25 have been divided into seventeen equal intervals
or cells of 0.50 each. The frequency with which observations
fall within each cell has been tabulated and summarized. These data
were then plotted in Figure 6.3 to produce what is generally referred
to as a histogram.
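
The tabulation described above can be sketched as follows. The cell
layout (seventeen cells of width 0.50 starting at 66.75) follows the
example; the function name `tabulate_cells` and the eight observations
are hypothetical, since the raw observations behind Table 6.1 are not
listed.

```python
def tabulate_cells(observations, lower=66.75, width=0.50, cells=17):
    """Rows of (low, high, frequency, relative, cumulative, cum. relative)."""
    if not observations:
        raise ValueError("at least one observation is required")
    counts = [0] * cells
    for x in observations:
        k = int((x - lower) // width)  # index of the cell containing x
        if not 0 <= k < cells:
            raise ValueError("observation %r lies outside the cells" % x)
        counts[k] += 1
    total = len(observations)
    rows, running = [], 0
    for k, count in enumerate(counts):
        running += count
        low = lower + k * width
        rows.append((low, low + width, count, count / total,
                     running, running / total))
    return rows


# Hypothetical observations for illustration only.
sample = [66.9, 70.1, 70.2, 71.0, 71.1, 71.3, 72.4, 75.0]
rows = tabulate_cells(sample)
for low, high, freq, rel, cum, cum_rel in rows:
    if freq:
        print("%.2f-%.2f" % (low, high), freq, round(rel, 3), cum,
              round(cum_rel, 3))
```

The frequency column gives the bar heights of the histogram; the last
column is the sample cumulative distribution used with probability
paper.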

The histogram serves two purposes. First, it provides visual
evidence on which to base preliminary selection of a distribution.
Second, in the case of limited data, it may provide as good an estimate
of the variability of the process as any other more elaborate
approach.

On the basis of its symmetry and bell shape, the histogram
of Figure 6.3 appears typical of data from a normal distribution.
Making an assumption of normality, it is possible to proceed to the
application of other quantitative methods to determine its validity.
TABLE 6.1

Sample Data

Cell                       Relative    Cumulative   Cumulative Relative
Boundaries     Frequency   Frequency   Frequency    Frequency
66.75-67.25        2         0.005          2          0.005
67.25-67.75        2         0.005          4          0.011
67.75-68.25        5         0.014          9          0.025
68.25-68.75        6         0.016         15          0.041
68.75-69.25        7         0.019         22          0.060
69.25-69.75       24         0.066         46          0.126
69.75-70.25       36         0.099         82          0.225
70.25-70.75       48         0.132        130          0.357
70.75-71.25       64         0.176        194          0.533
71.25-71.75       51         0.140        245          0.673
71.75-72.25       41         0.113        286          0.786
72.25-72.75       32         0.088        318          0.874
72.75-73.25       24         0.066        342          0.940
73.25-73.75       12         0.033        354          0.973
73.75-74.25        5         0.014        359          0.986
74.25-74.75        4         0.011        363          0.997
74.75-75.25        1         0.003        364          1.000
The data given in Table 6.1 can also be plotted on normal probability
paper. This provides a check on the assumption of a normal distribution
and also gives the appropriate parameters for the distribution if the
assumption of normality is accepted. The cumulative relative frequency
(sample cumulative distribution function), when plotted on normal
probability paper as shown in Fig. 6.4, turns out to be reasonably
linear. Thus it can be concluded, at least tentatively, that the data in
Table 6.1 have been drawn from a normal population. For many
applications this will suffice to identify a satisfactory distribution.
Note that the mean ($\mu$) and the standard deviation ($\sigma$) can
also be estimated from the graph.
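
Reading the parameters off the graph amounts to fitting a straight line
to the plotted points. The sketch below does this numerically for the
cumulative frequencies of Table 6.1, using an unweighted least-squares
line in place of a line drawn by eye; that substitution, and the
variable names, are assumptions of this sketch rather than the procedure
of this volume.

```python
from statistics import NormalDist

# Upper cell boundaries and cumulative frequencies from Table 6.1.
upper = [67.25 + 0.5 * k for k in range(17)]
cumulative = [2, 4, 9, 15, 22, 46, 82, 130, 194, 245, 286, 318, 342, 354,
              359, 363, 364]
total = cumulative[-1]

# Drop the last cell: a cumulative probability of 1 has no finite
# position on the normal probability scale.
points = [(x, NormalDist().inv_cdf(c / total))
          for x, c in zip(upper[:-1], cumulative[:-1])]

n = len(points)
x_bar = sum(x for x, _ in points) / n
z_bar = sum(z for _, z in points) / n
s_xz = sum((x - x_bar) * (z - z_bar) for x, z in points)
s_zz = sum((z - z_bar) ** 2 for _, z in points)

sigma_hat = s_xz / s_zz             # change in x per unit of normal score
mu_hat = x_bar - sigma_hat * z_bar  # value of x at the 50 percent point
print(round(mu_hat, 2), round(sigma_hat, 2))
```

The slope plays the role of the distance between the 0.50 and 0.84
probability points on the fitted line, and the intercept is the value at
the 0.50 point.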

Rather than go through the process of grouping the data into class
intervals or cells as in Table 6.1, one can plot the data directly onto
probability paper in the following way. The n observations $x_1$,
$x_2$, ..., $x_n$ are placed in ascending order (ranked) such that:

$$x_{(1)} \le x_{(2)} \le x_{(3)} \le \cdots \le x_{(n-1)} \le x_{(n)}.$$

To each $x_{(i)}$ associate the ordinate value $y_{(i)}$ and plot the
ordered pairs ($x_{(i)}$, $y_{(i)}$) on the probability paper. This
procedure is extremely fast, with the exception of having to rank the n
observations. Therefore, it is probably most useful for sample sizes in
the range 1-50, depending of course on how proficient one is at ranking
observations. Many excellent examples of the use of probability paper
for extreme value distributions may be found in Gumbel (14).
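
The ranking procedure can be sketched as follows. The plotting position
(i - 0.5)/n used here is one common convention and is an assumption of
this sketch; the function name and the five observations are likewise
hypothetical. The normal score returned with each point is its vertical
position on normal probability paper, so a roughly linear relation
between x and the score indicates a reasonable normal fit.

```python
from statistics import NormalDist


def probability_plot_points(observations):
    """Return (x, y, normal score) for each ranked observation."""
    ordered = sorted(observations)  # x_(1) <= x_(2) <= ... <= x_(n)
    n = len(ordered)
    points = []
    for i, x in enumerate(ordered, start=1):
        y = (i - 0.5) / n           # assumed plotting position
        points.append((x, y, NormalDist().inv_cdf(y)))
    return points


sample = [71.3, 69.8, 72.6, 70.9, 71.9]  # hypothetical observations
points = probability_plot_points(sample)
for x, y, z in points:
    print(x, round(y, 3), round(z, 3))
```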

This example is concluded with a visual verification of the selection
of a normal distribution to fit the data in Table 6.1. Figure 6.5 gives
the same information as Fig. 6.3 with the addition of the normal density
curve scaled to the frequency polygon.

Fig. 6.4. Cumulative polygon on normal probability paper (horizontal
axis: value of random variable)

Fig. 6.5. Comparison of Histogram and Normal Distribution (horizontal
axis: value of random variable)
7. ANALYTICAL CURVE FITTING


Analytical  curve  fitting  encompasses  a  variety  of  techniques  to 
smooth  an  empirical  histogram  for  use.  As  discussed  in  Part  I,  the 
purpose  of  analytical  curve  fitting  is  to  obtain  a  reasonable  functional 
approximation  of  the  empirical  histogram  to  be  used  in  a  simulation. 

For the purposes of Part II of this volume, analytical curve fitting
will be restricted to the use of three families of probability
distributions. These are the Weibull, Johnson, and Pearson
distributions. The reader who is unfamiliar with these distributions may
wish to refer to Appendix A for a background discussion. The Weibull
family is the easiest to work with and the Pearson family is the most
difficult to work with. It is, therefore, recommended that analytical
curve fitting be tried first with the Weibull, then if need be with the
Johnson, and finally if necessary with the Pearson distributions.

The procedure for selecting one or more of these families is based
on Table 7.1. The use of Table 7.1 is facilitated if qualitative
information about the random processes and a sketch of the probability
density are available. Once one or more families have been chosen, the
selection procedure outlined in Section 4 should be followed.

Since using the Weibull, Johnson, or Pearson distribution is
tantamount to using a smoothed histogram, some consideration should be
given to using the histogram itself rather than a distribution. This is
especially true if the histogram is drawn from an ample set of data, if
the Weibull, Johnson, and Pearson curves do not give reasonably good
fits, or if the histogram is multimodal. In the latter case the
underlying population may actually be several distinct populations, and
unless the user is prepared to separate that population by techniques
not discussed here, using the histogram may be most expedient.


TABLE 7.1

Characteristics of Complex Probability Curves

Family    Number of             General                       Figures for Shapes
Name      Parameters            Characteristics               of Densities
Weibull   3                     Unimodal, finite left         Figure 7.1
                                bound, tail to right
Johnson   4 (plus choice of     Bounded or unbounded,         Figures 7.2-7.3
          three functions)      variety of shapes,
                                mostly unimodal
Pearson   up to 4 (plus         Great variety of curves       Figure 7.5
          choice of twelve
          functions)
Fig. 7.2. Johnson Probability Density Functions for Sy

Fig. 7.5. Typical Shapes of Pearson Distributions (panels: Types I, II,
III, V, VI, VIII, IX, X, XI, and XII). Note: Types IV and VII are
similar to Normal Distributions.
8. PARAMETER ESTIMATION


Once a specific type from a family of probability distributions has
been tentatively chosen to model a random variable, specific parameters
for the distribution must be chosen. These parameters should be chosen
so that the resulting specific distribution will best fit the data and
knowledge available. This section is devoted to finding the specific
parameter values based on the empirical data (observations) available.

If no data are available, the parameters must be chosen arbitrarily.
In this case no estimation procedure exists that is better than the
analyst's intuition and judgment. If data are available, the parameters
can be estimated based on the sample of data. Estimates, in this case,
always begin with calculation of certain sample statistics, which are
given in Section 8.1. This section should be used in conjunction with
the directions given in Section 8.2. This latter section gives formulas
for estimating the specific parameters for all of the distributions
considered. Since not all the sample statistics in Section 8.1 are
needed for all the distributions and parameters in Section 8.2,
Section 8.2 should be referred to before calculating sample statistics.

8.1 CALCULATING SAMPLE STATISTICS

The sample statistics given in this section include the sample mean,
median, variance, skewness, kurtosis, 3rd moment, and 4th moment.
To establish some standard notation, we define the following symbols:

n = number of data points

$x_i$ = i-th data point (observation) for i = 1, 2, ..., n.




The sample statistics are calculated as follows:

Sample Mean (symbol $\bar{x}$)

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

Sample Median

First rank the observations from smallest to largest. If n is odd,
the median is given by the value of the [(n+1)/2]th observation. If n is
even, the median is given by the mean of the [n/2]th and [(n/2)+1]th
observations.

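Sample Variance (symbol $s^2$)

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

or, more conveniently

$$s^2 = \frac{1}{n-1} \left[ \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \right]$$

Sample mth Centralized Moment (symbol $\mu_m$) (only $\mu_3$ and $\mu_4$
needed)

$$\mu_m = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^m$$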


Sample Skewness (symbol $\beta_1$)

$$\beta_1 = \mu_3 / s^3$$

Sample Kurtosis (symbol $\beta_2$)

$$\beta_2 = \mu_4 / s^4$$

Interpretation of the last two estimators is usually in terms of how
well the data fit the normal distribution. If the skewness is close to
zero and the kurtosis is close to three, the normal distribution should
provide a good approximation to the distribution. Figure 8.1 gives an
interpretation of the skewness value. Zero indicates a symmetric
distribution, negative skewness means a long left tail, positive values
a long right tail. Figure 8.2 illustrates the kurtosis measure. If the
kurtosis is greater than three, the distribution is more peaked than the
normal (curve C). If it is less than three, the curve is flatter than
the normal (curve A).
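
The statistics of this section can be computed together, as in the
sketch below. It assumes the n - 1 divisor for the sample variance and
the 1/n divisor for the centralized moments; these conventions, the
function name, and the eight data values are assumptions of the sketch.

```python
def sample_statistics(xs):
    """Mean, median, variance, 3rd and 4th moments, skewness, kurtosis."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(xs) / n

    ordered = sorted(xs)
    mid = n // 2
    median = ordered[mid] if n % 2 else 0.5 * (ordered[mid - 1] + ordered[mid])

    variance = sum((x - mean) ** 2 for x in xs) / (n - 1)  # assumed divisor
    s = variance ** 0.5
    mu3 = sum((x - mean) ** 3 for x in xs) / n             # assumed divisor
    mu4 = sum((x - mean) ** 4 for x in xs) / n

    return {
        "mean": mean,
        "median": median,
        "variance": variance,
        "mu3": mu3,
        "mu4": mu4,
        "skewness": mu3 / s ** 3,
        "kurtosis": mu4 / s ** 4,
    }


data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations
stats = sample_statistics(data)
for name, value in stats.items():
    print(name, round(value, 3))
```

For these values the skewness is positive, indicating a longer right
tail, and the kurtosis is below three, indicating a curve flatter than
the normal.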

8.2 CALCULATING PARAMETER ESTIMATES

This section is divided into two parts. Section 8.2.1 deals with
the simple distributions. This section will be the one more commonly
used. Section 8.2.2 is more complicated and deals with estimating
parameters for the complex distributions.

8.2.1 Simple Parametric Distributions

Refer to Table 4.3 to obtain the recommended parameter estimates
for the selected distribution. Use Section 8.1 to obtain the sample
statistics required.

8.2.2 Complex Parametric Distributions

As can be seen in Table 4.3, estimating parameters for the Weibull,
Johnson, and Pearson distributions is more involved than for the simple
distributions. The reason for this is that the simple distributions
generally have one or two parameters, whereas the complex distributions
have 3 to 5 effective parameters. Background for the material which
follows can be found in Appendix A.
Fig. 8.1. Skewed distributions

Fig. 8.2. Three frequency curves with different degrees of kurtosis


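8.2.2.1 Weibull

The basic three-parameter Weibull distribution has a density given
by:

$$f(x) = \frac{\eta}{\lambda} \left( \frac{x - \epsilon}{\lambda} \right)^{\eta - 1} \exp\left[ -\left( \frac{x - \epsilon}{\lambda} \right)^{\eta} \right], \quad x \ge \epsilon$$

$$f(x) = 0, \quad x < \epsilon$$

where:

f(x) = Weibull probability distribution
$\epsilon$ = location parameter
$\lambda$ = scale parameter
$\eta$ = shape parameter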

In most applications the location parameter, $\epsilon$, is known. In
cases where it is not known, it can be estimated from the observations:

$$\hat{\epsilon} = \min\{x_i\}.$$

Better estimates of $\epsilon$ can be obtained using techniques
developed by Dubey; however, the improvement is not usually sufficient
to warrant the extra effort involved.

The maximum likelihood estimators for the three-parameter Weibull
distribution result in a set of equations that can be solved by
iterative methods which are very tedious to perform. If the location
parameter is known or estimated, the maximum likelihood equations for
$\lambda$ and $\eta$ can be solved fairly easily (51) and are given by:
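and

$$\hat{\lambda} = \left( \frac{\sum_{i=1}^{n} x_i^{\hat{\eta}}}{n} \right)^{1/\hat{\eta}} \qquad (8.2)$$

where:

$\hat{\eta}$ = Maximum likelihood estimator of $\eta$
$\hat{\lambda}$ = Maximum likelihood estimator of $\lambda$

Equation 8.1 can be solved by the Newton-Raphson iterative procedure:

$$\hat{\eta}_{k+1} = \hat{\eta}_k - \frac{f(\hat{\eta}_k)}{f'(\hat{\eta}_k)}$$

where $f(\hat{\eta}) = 0$ denotes Equation 8.1 and k is the iteration
index.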

The  estimate  r?  is  biased  and  should  be  corrected  using  the  unbiasing  fac- 

A 

tors  in  Table  B-lof  Appendix  B.  Then,  the  estimate  for  X  can  be  obtained 
directly  from  (8. 2).  Further  improvement  can  be  obtained  by  using  Menon’s 
estimators. 
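
The iteration described above can be sketched in a few lines. The following
is a minimal illustration only, assuming the shape equation is the usual
Weibull likelihood equation in the shape parameter and that every observation
lies strictly above the location value (so an estimate equal to the sample
minimum would have to be reduced slightly first). The function name
`weibull_mle`, the starting value of 1.0, and the step-halving safeguard are
assumptions made for this sketch; the bias correction of Table B-1 is not
applied.

```python
import math


def weibull_mle(xs, loc=0.0, tol=1e-10, max_iter=200):
    """Return (eta_hat, lam_hat) for a Weibull sample with known location."""
    ys = [x - loc for x in xs]
    if min(ys) <= 0.0:
        raise ValueError("all observations must exceed the location parameter")
    n = len(ys)
    logs = [math.log(y) for y in ys]
    mean_log = sum(logs) / n

    eta = 1.0  # arbitrary positive starting value (assumption of this sketch)
    for _ in range(max_iter):
        powers = [y ** eta for y in ys]
        s0 = sum(powers)
        s1 = sum(p * l for p, l in zip(powers, logs))
        s2 = sum(p * l * l for p, l in zip(powers, logs))
        # g is the left-hand side of the shape equation; dg is its derivative.
        g = s1 / s0 - 1.0 / eta - mean_log
        dg = (s2 * s0 - s1 * s1) / (s0 * s0) + 1.0 / (eta * eta)
        new_eta = eta - g / dg
        if new_eta <= 0.0:
            new_eta = eta / 2.0  # keep the iterate positive
        converged = abs(new_eta - eta) < tol
        eta = new_eta
        if converged:
            break

    lam = (sum(y ** eta for y in ys) / n) ** (1.0 / eta)
    return eta, lam
```

Because the derivative `dg` is always positive, the Newton step is well
defined at every iterate; for very large observations the powers `y ** eta`
may overflow, in which case the data should be rescaled first.
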

8.2.2.2 Johnson Distributions

As indicated in Table 4.3, there are three Johnson distributions. These
three are generally denoted $S_L$, $S_B$, and $S_U$ because these distributions
are related to the normal distribution through a logarithmic transformation
($S_L$), bounded transformation ($S_B$), and unbounded transformation ($S_U$).
The problem of estimating parameters of the Johnson distribution thus
becomes a two-step procedure. First determine which distribution to use, then
estimate the appropriate parameters.

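The probability density functions for the three Johnson distributions
are:

$$S_L:\quad f(x) = \frac{\eta}{\sqrt{2\pi}\,(x-\epsilon)} \exp\left\{-\tfrac{1}{2}\left[\gamma + \eta \ln(x-\epsilon)\right]^2\right\}, \quad x \ge \epsilon$$

$$S_B:\quad f(x) = \frac{\eta}{\sqrt{2\pi}}\,\frac{\lambda}{(x-\epsilon)(\lambda-x+\epsilon)} \exp\left\{-\tfrac{1}{2}\left[\gamma + \eta \ln\left(\frac{x-\epsilon}{\lambda-x+\epsilon}\right)\right]^2\right\}, \quad \epsilon \le x \le \epsilon+\lambda$$

$$S_U:\quad f(x) = \frac{\eta}{\sqrt{2\pi}}\,\frac{1}{\sqrt{(x-\epsilon)^2+\lambda^2}} \exp\left\{-\tfrac{1}{2}\left[\gamma + \eta \sinh^{-1}\left(\frac{x-\epsilon}{\lambda}\right)\right]^2\right\}, \quad -\infty < x < \infty$$

In these distributions η and γ are shape parameters, λ is a scale
parameter, and ε is a location parameter. These must satisfy:

$$\eta > 0, \quad \lambda > 0, \quad -\infty < \gamma < \infty, \quad -\infty < \epsilon < \infty$$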

In Section 8.1, expressions are given for the skewness, $\beta_1$, and
kurtosis, $\beta_2$, of sample data. These are used to determine which
distribution, $S_L$, $S_B$, or $S_U$, to use. This can be accomplished by plotting
the sample $\beta_1$ and $\beta_2$ on Fig. 8.3. The location of the sample point
($\beta_1$, $\beta_2$) indicates the distribution to select. One warning must be given,
however. Figure 8.3 is accurate for categorizing distributions given the
true values of $\beta_1$ and $\beta_2$. The values for $\beta_1$ and $\beta_2$ derived from the
sample (Section 8.1) are estimates of the true values. Thus if the sample
point falls near the edge of a region in Fig. 8.3, i.e., near the $S_L$ line,
then it would be prudent to try all three Johnson distributions or to select
one or more based on possible boundedness of the random variable in
question. Examining the density functions given above will aid in this
determination.


The parameter estimates for the Johnson distributions are given
below. The estimates of the Johnson parameters are not maximum likelihood
estimates, except for the $S_L$ (ε known) case; however, they are the most
practical to use. The approach taken is to use percentile points from the
data. Recall that a 100α percentile point for the population, $x_\alpha$, is that
value of x for which $P[x \le x_\alpha] = \alpha$. We assume that the random sample
$x_1, \ldots, x_n$ has been ordered to give the order statistics
$w_1 < w_2 < \ldots < w_n$. Then the kth order statistic will provide an estimate
for the 100α percentile of the population, where:

$$\alpha = \frac{k-1}{n-1} \qquad (8.3)$$


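This will be required in the subsequent applications.

$S_L$ (ε known)

In this case the estimators for η and γ are, respectively,

$$\hat{\eta} = \left\{\frac{1}{n}\sum_{i=1}^{n}\left[\ln(x_i-\epsilon)\right]^2 - \left[\frac{1}{n}\sum_{i=1}^{n}\ln(x_i-\epsilon)\right]^2\right\}^{-1/2}$$

and

$$\hat{\gamma} = -\frac{\hat{\eta}}{n}\sum_{i=1}^{n}\ln(x_i-\epsilon) \qquad (8.4)$$

Thus, from the sample $x_1, \ldots, x_n$ the parameters η and γ can be readily
estimated with $\hat{\eta}$ and $\hat{\gamma}$, respectively.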
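
As an illustration of the order-statistic rule and of the estimators for the
log-normal case with known location, a short sketch follows. It assumes the
estimators take the maximum likelihood form (divisor n in the variance of the
logarithms); the function names are hypothetical and chosen only for this
example.

```python
import math


def order_statistic_level(k, n):
    """Cumulative probability estimated by the kth of n order statistics."""
    return (k - 1) / (n - 1)


def johnson_sl_known_location(xs, loc):
    """Return (eta_hat, gamma_hat) for the S_L family with known location."""
    logs = [math.log(x - loc) for x in xs]
    n = len(logs)
    mean_log = sum(logs) / n
    var_log = sum(v * v for v in logs) / n - mean_log * mean_log
    eta_hat = var_log ** -0.5          # reciprocal standard deviation of ln(x - loc)
    gamma_hat = -eta_hat * mean_log    # so that gamma + eta * ln(x - loc) has mean 0
    return eta_hat, gamma_hat
```

For a sample of 51 observations, `order_statistic_level(6, 51)` gives 0.1, so
the 6th order statistic estimates the 10th percentile.
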


$S_L$ (ε unknown)

Again, the maximum likelihood estimators may be obtained but with some
difficulty, and it is perhaps better to use the percentile approach. That is,
assume the percentile points $x_{\alpha_1}$, $x_{\alpha_2}$, and $x_{\alpha_3}$ have been
estimated. These are required since there are three parameters ε, η, and γ
to estimate. If $z_\alpha$ is defined as the value of the variable in the normal
distribution function corresponding to the cumulative probability α, then

$$z_{\alpha_1} = \gamma + \eta \ln(x_{\alpha_1} - \epsilon)$$

$$z_{\alpha_2} = \gamma + \eta \ln(x_{\alpha_2} - \epsilon)$$

$$z_{\alpha_3} = \gamma + \eta \ln(x_{\alpha_3} - \epsilon)$$

Explicit solutions cannot be obtained for ε, γ, and η from these
equations although they can be determined iteratively. However, the following
example will illustrate the use of one simplification.

Suppose a sample of size n = 51 has been obtained. The 6th, 26th, and
46th order statistics from $w_1 < w_2 < \ldots < w_{51}$ will be used to estimate the
following percentiles:

$$x_{\alpha_1} = x_{.1} = w_6$$

$$x_{\alpha_2} = x_{.5} = w_{26}$$

$$x_{\alpha_3} = x_{.9} = w_{46}$$

where $\alpha_i$, i = 1, 2, 3, is obtained using Eq. 8.3. From Table B-2 in
Appendix B:

$$z_{.1} = -1.28$$

$$z_{.5} = 0$$

$$z_{.9} = 1.28$$

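Substituting these values into the three percentile equations:

$$-1.28 = \gamma + \eta \ln(w_6 - \epsilon)$$

$$0 = \gamma + \eta \ln(w_{26} - \epsilon)$$

$$1.28 = \gamma + \eta \ln(w_{46} - \epsilon)$$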


The advantage of selecting $\alpha_1 = 1 - \alpha_3$ and $\alpha_2 = .5$ should be noted.
The percentiles chosen are, of course, rather arbitrary and, therefore,
many estimates could be obtained for ε, γ and η. In this case, comparisons
of the relative goodness-of-fit for each selection may be appropriate.
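
With the symmetric choice of percentiles in the example (median plus two
percentiles equally far into each tail), the three percentile equations can be
solved in closed form: adding the two tail equations and using the median
equation eliminates γ and η and leaves a linear equation for the location.
The sketch below implements that algebra. It is a derivation made for this
illustration, not a formula quoted from the text, and the function name is
hypothetical. It assumes the upper percentile lies farther from the median
than the lower one, as a log-normal shape requires.

```python
import math


def johnson_sl_from_percentiles(x_lo, x_mid, x_hi, z_hi=1.28):
    """Solve the S_L percentile equations for (eps, eta, gamma).

    x_lo, x_mid, x_hi estimate the 100a, 50 and 100(1-a) percentiles;
    z_hi is the standard normal deviate of the upper percentile.
    """
    denom = x_lo + x_hi - 2.0 * x_mid
    if denom <= 0.0:
        raise ValueError("upper tail must be longer than lower tail for S_L")
    # (x_mid - eps)^2 = (x_lo - eps)(x_hi - eps) is linear in eps.
    eps = (x_lo * x_hi - x_mid * x_mid) / denom
    eta = 2.0 * z_hi / math.log((x_hi - eps) / (x_lo - eps))
    gamma = -eta * math.log(x_mid - eps)
    return eps, eta, gamma
```

In the example of the text, the three arguments would be the 6th, 26th, and
46th order statistics of the 51 observations.
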

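$S_B$ (ε, λ known)

This case implies both end points of the distribution are known. Using
the percentile method, estimators for γ and η may be obtained:

$$\hat{\eta} = \frac{z_{\alpha_2} - z_{\alpha_1}}{\ln\left[\dfrac{(x_{\alpha_2}-\epsilon)(\epsilon+\lambda-x_{\alpha_1})}{(x_{\alpha_1}-\epsilon)(\epsilon+\lambda-x_{\alpha_2})}\right]}$$

$$\hat{\gamma} = z_{\alpha_2} - \hat{\eta} \ln\left[\frac{x_{\alpha_2}-\epsilon}{\epsilon+\lambda-x_{\alpha_2}}\right] \qquad (8.5)$$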
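
The two-percentile estimators for the bounded case amount to fitting a
straight line through two points on the transformed scale. A minimal sketch,
with a hypothetical function name and with the percentile estimates and their
normal deviates passed in by the caller, is:

```python
import math


def johnson_sb_known_bounds(x1, z1, x2, z2, eps, lam):
    """Return (eta_hat, gamma_hat) for S_B when both end points are known.

    (x1, z1) and (x2, z2) are percentile estimates paired with the standard
    normal deviates of their cumulative probabilities.
    """
    t1 = math.log((x1 - eps) / (eps + lam - x1))
    t2 = math.log((x2 - eps) / (eps + lam - x2))
    eta_hat = (z2 - z1) / (t2 - t1)
    gamma_hat = z2 - eta_hat * t2
    return eta_hat, gamma_hat
```
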
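
$S_B$ (General Case)

This case implies that none of the parameters are known and requires
that the appropriate number of equations of the form

$$z_\alpha = \gamma + \eta \ln\left(\frac{x_\alpha - \epsilon}{\epsilon + \lambda - x_\alpha}\right)$$

be solved for the unknown parameters. Generally, this will lead to
transcendental equations which can be solved numerically. There is one
simplification in the case where ε is known and the percentiles are selected
such that $\alpha_1 = 1 - \alpha_3$ and $\alpha_2 = .5$ (only three equations of the type
above are required for this case). The solution for $\hat{\lambda}$ for this case is

$$\hat{\lambda} = \frac{(x_{.5}-\epsilon)\left[(x_{.5}-\epsilon)(x_{\alpha_1}-\epsilon) + (x_{.5}-\epsilon)(x_{\alpha_3}-\epsilon) - 2(x_{\alpha_1}-\epsilon)(x_{\alpha_3}-\epsilon)\right]}{(x_{.5}-\epsilon)^2 - (x_{\alpha_1}-\epsilon)(x_{\alpha_3}-\epsilon)} \qquad (8.6)$$

Equation 8.5 may then be used to generate estimates for η and γ
since with 8.6 the problem reduces to one with both end points known.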
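
The simplification for a known lower end point can be sketched as follows.
The closed form used here follows from the three percentile equations when
the median and two symmetric tail percentiles are chosen; it is derived for
this illustration, and the function name is hypothetical.

```python
def johnson_sb_scale(x_lo, x_mid, x_hi, eps):
    """Estimate the S_B range parameter from symmetric percentiles.

    x_lo, x_mid, x_hi estimate the 100a, 50 and 100(1-a) percentiles and
    eps is the known lower end point.
    """
    u1, u2, u3 = x_lo - eps, x_mid - eps, x_hi - eps
    return u2 * (u2 * (u1 + u3) - 2.0 * u1 * u3) / (u2 * u2 - u1 * u3)
```

Once the range is estimated, the upper end point is known as well and the
two-percentile estimators for the case with both end points known apply.
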

$S_U$ (General Case)

For the general case of the $S_U$ system, Johnson has generated tables (22)
that are useful for determining the parameters. These are presented
in Tables B-3 and B-4 of Appendix B. The tables were developed from
solutions of equations defining the relationships of the first four moments
to the parameters.

Use of the tables first requires that the mean, variance, skewness
and kurtosis be calculated. The values for $\sqrt{\beta_1}$ and $\beta_2$ are then used to
obtain the estimates for γ and η from Tables B-3 and B-4, respectively.
The λ and ε estimators are calculated using:

$$\hat{\lambda} = \frac{s}{\left[\tfrac{1}{2}(\omega-1)(\omega\cosh 2\Omega + 1)\right]^{1/2}} \qquad (8.7)$$

where s is the sample standard deviation (see Section 8.1), and

$$\hat{\epsilon} = \bar{x} + \hat{\lambda}\,\omega^{1/2}\sinh\Omega \qquad (8.8)$$

with $\omega = \exp(1/\hat{\eta}^2)$, $\Omega = \hat{\gamma}/\hat{\eta}$, and $\bar{x}$ the sample mean.

To illustrate use of the tables, assume a random sample gave
$\sqrt{\beta_1} = .5$ and $\beta_2 = 6$. From Tables B-3 and B-4,

$$\hat{\gamma} = -.3278 \quad \text{and} \quad \hat{\eta} = 1.672$$

$\hat{\lambda}$ and $\hat{\epsilon}$ may now be calculated directly from Eqs. 8.7 and 8.8.
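
Once γ and η have been read from the tables, the remaining two parameters
follow from the sample mean and standard deviation. The sketch below assumes
the standard moment relations of the unbounded system (mean and variance of a
hyperbolic-sine transform of a normal variable); the function name and
argument order are hypothetical.

```python
import math


def johnson_su_scale_location(mean, sd, gamma, eta):
    """Return (lam_hat, eps_hat) for S_U from the sample mean and std. dev."""
    omega = math.exp(1.0 / (eta * eta))
    big_omega = gamma / eta
    # Variance of sinh((Z - gamma) / eta) for standard normal Z.
    unit_var = 0.5 * (omega - 1.0) * (omega * math.cosh(2.0 * big_omega) + 1.0)
    lam_hat = sd / math.sqrt(unit_var)
    eps_hat = mean + lam_hat * math.sqrt(omega) * math.sinh(big_omega)
    return lam_hat, eps_hat
```

For the tabulated values of the example, the call would be
`johnson_su_scale_location(mean, sd, -0.3278, 1.672)` with the sample mean and
standard deviation supplied by the analyst.
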

8.2.2.3 Pearson Distributions

There are twelve Pearson distributions. These are generally indicated
by Roman numerals: Type I through Type XII. The problem of estimating
Pearson parameters, like those of the Johnson, becomes a two-step problem.
First determine which Pearson Type to use, then estimate the appropriate
parameters.

To determine which Pearson distribution to use, the skewness, $\beta_1$,
and kurtosis, $\beta_2$, of the sample data (see Section 8.1) are needed. The
sample point ($\beta_1$, $\beta_2$) should be plotted on Fig. 8.4. The location of the
sample point indicates what distribution to use. A warning needs to be given
on using this procedure. The point ($\beta_1$, $\beta_2$) calculated from the data as in
Section 8.1 is only an estimate of the true values. Thus if the sample point
falls near a line separating two regions in Fig. 8.4, the Pearson Type in
either region or on the line may fit the data. In this event more than one
Pearson Type should be tried. It should also be clear from examination of
Fig. 8.4 that only Types I, IV, and VI are indicated by regions; therefore,
in practice only these types will be indicated by strict application of this
selection procedure.

Fig. 8.4. Selection of Pearson Type from Skewness and Kurtosis

Selection of a Pearson Type may also be aided by examining the
Remarks column of Table 8.1. This table lists all twelve Pearson Types and
some information on each. The form of the density function should be
obtained from Table 8.1.

The  parameters  for  the  density  functions  are  given  below. 

TABLE 8.1. Summary of Pearson Distributions

Type I

$$f(x) = y_0\left(1+\frac{x}{a_1}\right)^{m_1}\left(1-\frac{x}{a_2}\right)^{m_2}$$

Calculate the quantities

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{6 + 3\beta_1 - 2\beta_2}$$

$$t = \tfrac{1}{2}\, s\left[\beta_1(r+2)^2 + 16(r+1)\right]^{1/2}$$

$m_1$ and $m_2$ are given by:

$$m = \frac{1}{2}\left\{ r - 2 \pm r(r+2)\left[\frac{\beta_1}{\beta_1(r+2)^2 + 16(r+1)}\right]^{1/2}\right\}$$

If $\mu_3$ is positive, take $m_2$ to be the positive root.

$$a_1 = \frac{t}{m_2/m_1 + 1}$$

$$a_2 = \frac{t}{m_1/m_2 + 1}$$

$$y_0 = \frac{1}{t}\cdot\frac{m_1^{m_1}\, m_2^{m_2}}{(m_1+m_2)^{m_1+m_2}}\cdot\frac{\Gamma(m_1+m_2+2)}{\Gamma(m_1+1)\,\Gamma(m_2+1)}$$
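
The Type I calculation can be sketched as below. This is an illustration
under stated assumptions: $\beta_1$ is taken as the squared standardized third
moment, the normalizing constant includes the factor 1/t so that the density
integrates to one, and both exponents are assumed positive (the sketch rejects
other cases). The function name is hypothetical.

```python
import math


def pearson_type1_parameters(s, beta1, beta2, mu3_positive=True):
    """Return (m1, m2, a1, a2, y0) for a Pearson Type I fit."""
    r = 6.0 * (beta2 - beta1 - 1.0) / (6.0 + 3.0 * beta1 - 2.0 * beta2)
    core = beta1 * (r + 2.0) ** 2 + 16.0 * (r + 1.0)
    t = 0.5 * s * math.sqrt(core)                 # t = a1 + a2
    half = 0.5 * r * (r + 2.0) * math.sqrt(beta1 / core)
    root_plus = 0.5 * (r - 2.0) + half
    root_minus = 0.5 * (r - 2.0) - half
    # With positive third moment, m2 takes the root with the plus sign.
    m1, m2 = (root_minus, root_plus) if mu3_positive else (root_plus, root_minus)
    if m1 <= 0.0 or m2 <= 0.0:
        raise ValueError("sketch handles only positive exponents")
    a1 = t * m1 / (m1 + m2)                       # same as t / (m2/m1 + 1)
    a2 = t * m2 / (m1 + m2)                       # same as t / (m1/m2 + 1)
    y0 = (m1 ** m1 * m2 ** m2 / (m1 + m2) ** (m1 + m2)
          * math.gamma(m1 + m2 + 2.0)
          / (math.gamma(m1 + 1.0) * math.gamma(m2 + 1.0)) / t)
    return m1, m2, a1, a2, y0
```

A convenient check is a beta distribution with shape parameters 2 and 4 on
the unit interval, which is a Type I curve with exponents 1 and 3 and a total
range of one.
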

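Type II

$$f(x) = y_0\left(1-\frac{x^2}{a^2}\right)^{m}$$

The function parameters are found as follows:

$$m = \frac{5\beta_2 - 9}{2(3-\beta_2)}$$

$$a^2 = \frac{2 s^2 \beta_2}{3-\beta_2}$$

$$y_0 = \frac{1}{a\sqrt{\pi}}\cdot\frac{\Gamma(m+1.5)}{\Gamma(m+1)}$$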

Type III

$$f(x) = y_0\left(1+\frac{x}{a}\right)^{\gamma a} e^{-\gamma x}$$

The function parameters are given by:

$$p = \gamma a = \frac{4}{\beta_1} - 1$$

$$a = \frac{2 s^4}{\mu_3} - \frac{\mu_3}{2 s^2}$$

$$y_0 = \frac{1}{a}\cdot\frac{p^{\,p+1}}{e^{p}\,\Gamma(p+1)}$$


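Type IV

$$f(x) = y_0\left(1+\frac{x^2}{a^2}\right)^{-m} \exp\left[-\nu \tan^{-1}(x/a)\right]$$

The function parameters are given by:

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{2\beta_2 - 3\beta_1 - 6}$$

$$m = \tfrac{1}{2}(r+2)$$

$$\nu = -r(r-2)\sqrt{\beta_1}\left[16(r-1) - \beta_1(r-2)^2\right]^{-1/2}$$

$$a = \left[\frac{s^2}{16}\left(16(r-1) - \beta_1(r-2)^2\right)\right]^{1/2}$$

$$y_0 = \frac{1}{a\,F(r,\nu)}$$

where $F(r, \nu)$ is given in Reference 42.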
Type V

$$f(x) = y_0\, x^{-p} \exp(-\gamma/x) \qquad (x > 0)$$

The function parameters are given by:

$$p = 4 + \frac{8 + 4\sqrt{4+\beta_1}}{\beta_1}$$

$$\gamma = s(p-2)\sqrt{p-3}$$

$$y_0 = \frac{\gamma^{\,p-1}}{\Gamma(p-1)}$$
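
Type VI

$$f(x) = y_0\,(x-a)^{q_2}\, x^{-q_1}$$

The function parameters are given by:

$$r = \frac{6(\beta_2 - \beta_1 - 1)}{6 + 3\beta_1 - 2\beta_2}$$

$$a = \tfrac{1}{2}\, s\left[\beta_1(r+2)^2 + 16(r+1)\right]^{1/2}$$

$q_2$ and $-q_1$ are given by:

$$q = \frac{r-2}{2} \pm \frac{r(r+2)}{2}\left[\frac{\beta_1}{\beta_1(r+2)^2 + 16(r+1)}\right]^{1/2}$$

$$y_0 = \frac{a^{\,q_1-q_2-1}\,\Gamma(q_1)}{\Gamma(q_1-q_2-1)\,\Gamma(q_2+1)}$$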


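Type VII

$$f(x) = y_0\left(1+\frac{x^2}{a^2}\right)^{-m}$$

The function parameters are given by:

$$m = \frac{5\beta_2 - 9}{2(\beta_2-3)}$$

$$a^2 = \frac{2 s^2 \beta_2}{\beta_2-3}$$

$$y_0 = \frac{1}{a\sqrt{\pi}}\cdot\frac{\Gamma(m)}{\Gamma(m-0.5)}$$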

Type VIII

$$f(x) = y_0\,(1 + x/a)^{-m}$$

The function parameters are given by:

$$a = \pm s\,(2-m)\sqrt{(3-m)/(1-m)}$$

$$y_0 = (1-m)/a$$

where m is the solution of

$$m^3(4-\beta_1) + m^2(9\beta_1 - 12) - 24\beta_1 m + 16\beta_1 = 0, \qquad 0 < m < 1$$

Type IX

$$f(x) = y_0\,(1 + x/a)^{m}$$

The function parameters are given by:

$$a = \pm s\,(m+2)\sqrt{(m+3)/(m+1)}$$

$$y_0 = (m+1)/a$$

where m is the solution of

$$m^3(\beta_1-4) + m^2(9\beta_1 - 12) + 24\beta_1 m + 16\beta_1 = 0, \qquad m > 0$$

Type X

$$f(x) = y_0 \exp(-x/s)$$

The parameter is given by:

$$y_0 = 1/s$$

Type XI

$$f(x) = y_0\, x^{-m}$$

The function parameters are given by:

$$y_0 = b^{\,m-1}(m-1)$$

$$b = \pm s\,(m-2)\sqrt{(m-3)/(m-1)}$$

where m is a solution of

$$m^3(4-\beta_1) + m^2(9\beta_1 - 12) - 24\beta_1 m + 16\beta_1 = 0$$

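Type XII

$$f(x) = y_0\left[\frac{s\left(\sqrt{3+\beta_1} + \sqrt{\beta_1}\right) + x}{s\left(\sqrt{3+\beta_1} - \sqrt{\beta_1}\right) - x}\right]^{m}$$

$y_0$ is given by

$$y_0 = \frac{1}{b\,\Gamma(m+1)\,\Gamma(1-m)}$$

where

$$m = \sqrt{\beta_1/(3+\beta_1)}$$

$$b = 2s\sqrt{3+\beta_1}$$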

9. GOODNESS-OF-FIT TESTS


Goodness-of-fit tests are statistical tests for evaluating whether a
group of data supports the assumption that the random variable from which
the data are drawn has come from the assumed probability distribution.

These tests are helpful in accepting or rejecting the conclusion that some
random variable follows a tentatively selected probability distribution.

The technique of applying statistical tests of distributional
assumptions follows three basic steps:

1. A number known as a test statistic is calculated from
the observed data.

2. The probability of obtaining the calculated test statistic,
assuming the selected distribution is correct, is determined.
This can often be done by using precomputed tables
of percentiles of the distribution of the test statistic.

3. If the probability of obtaining the calculated test statistic is
low, the conclusion is that the assumed distribution does not
provide an adequate representation. If the probability
associated with the test statistic is not low, then the data
provide no evidence that the assumed distribution is
inadequate.

It  should  be  clearly  understood  that  although  this  procedure  allows 
rejection  of  a  distribution  as  inadequate,  it  never  proves  that  the  model  is 
correct.  In  fact,  the  outcome  of  a  statistical  test  depends  highly  upon  the 
amount  of  data  available  -  the  more  data  there  are,  the  better  are  the  chances 
of  rejecting  an  inadequate  model.  If  too  few  data  points  are  available,  even 
a  model  that  deviates  grossly  from  the  assumed  model  frequently  cannot  be 
established  as  inadequate. 

Table 9.1 provides summary information on goodness-of-fit
tests and also indicates to which distributions the tests are applicable.

After a test is selected from this table, instructions on how to perform
the test can be found in the subsequent sections.

A comment on using goodness-of-fit tests on the complex
distributions (Weibull, Johnson, and Pearson) may also be helpful. These
distributions are designed to fit almost any set of data well. It is, therefore,
unlikely that any of them will be rejected by a goodness-of-fit test. Using
goodness-of-fit tests on any of these distributions will not generally give
the analyst much further information on the form of the true distribution,
and he may elect to accept one of these complex distributions without a
goodness-of-fit test.

This brief background should suffice for practical use of
goodness-of-fit tests in simulation modeling. In the following section, a simple
selection procedure is given to determine what goodness-of-fit test to use based
upon the probability distribution tentatively selected to model the random
variable in question. In the following sections these tests are described
and instructions for performing the tests are given. Although there are
numerous statistical tests, these are the most powerful available.

9.1 CHI-SQUARE GOODNESS-OF-FIT TEST

The Chi-square goodness-of-fit test is probably the most widely used
and versatile technique for evaluating distributional assumptions. It can be
applied to test any distributional assumption without having to know the values
of the distribution parameters. Its major drawbacks are its lack of sensitivity
in detecting inadequate models when few observations are available and the
frequent need to arrange the data within arbitrary cells which can affect the
outcome of the test.

TABLE 9.1

Goodness-of-Fit Tests

Chi-square test (9.1)
   Distribution: Any
   Purpose: Test to evaluate sample for any distributional assumption
      for any type of distribution. A nonparametric or distribution-free
      test.
   General comments: A general and powerful statistical test that is
      widely used. However, it is not a good test for small samples.

Kolmogorov-Smirnov test (9.2)

"W" test (9.3)
   Distribution: Normal, Log Normal, Johnson (see Table ...)
   Purpose: Test to evaluate assumption that sample comes from a normal
      or log-normal distribution.
   General comments: A test more powerful than the χ² for testing the
      normal distribution assumption.

"WE" test (9.4)
   Distribution: Exponential (origin unknown)
   Purpose: Test to evaluate assumption that sample comes from an
      exponential distribution, origin unknown.
   General comments: A test more powerful than the χ² for testing the
      exponential distribution assumption.

"WE0" test (9.5)
   Distribution: Exponential (origin known)
   General comments: A test more powerful than the χ² for testing the
      exponential distribution assumption.

The  Chi-square  test  is  used  as  follows: 

Step  1.  Estimate  each  of  the  unknown  parameters  of  the  assumed  distribution. 

Step  2.  Divide  the  data  into  k  classes  or  cells  and  determine  the  probability 
of  a  random  value  from  the  assumed  model  falling  within  each  class.  Two 
methods  for  this  are  presented:  the  first  method  is  applicable  if  the  data  are 
initially  arranged  in  frequency  classes,  and  the  second  applies  when  the  data 
are  not  initially  tabulated  in  classes. 

Method  a.  The  number  of  cells,  k,  will  be  the  number  of  classes  of 
the  tabulated  data  subject  to  the  restriction  that  the  expected  number 
of  observations  in  each  cell  under  the  assumed  model  is  at  least  5. 

Let $CL_i$ and $CU_i$ denote the lower and upper bounds of the ith
frequency cell. The distribution of the assumed model (using the
estimated parameters) is then used to estimate:

$$\Pr(CL_i \le x < CU_i), \quad i = 1, 2, \ldots, k.$$

Method b. When the number of observations, n, is large (>200) a
good rule is to take k as the integer closest to

$$k' = 4\left[0.75(n-1)^2\right]^{1/5}.$$

For moderate values of n a good rule is to make k as large as
possible but with the restriction $k \le n/5$. The cell boundaries
$x_1, x_2, \ldots, x_{k-1}$ are determined from the cumulative distribution for
the assumed model (using the estimated parameters) as the values
such that:

$$\Pr(x \le x_1) = \frac{1}{k},\quad \Pr(x \le x_2) = \frac{2}{k},\quad \ldots,\quad \Pr(x \le x_{k-1}) = \frac{k-1}{k}$$


Step 3. Multiply each of the cell probabilities by the sample size n. This
yields the expected number $E_i$ of observations for each cell under the
assumed model. For Method 2b:

$$E_i = n/k, \quad i = 1, 2, \ldots, k$$

Step 4. If the data are not initially tabulated, count the number of observed
values, $m_i$, in each cell. Otherwise, determine $m_i$ directly.

Step 5. Compute the test statistic

$$\chi^2 = \sum_{i=1}^{k} \frac{(m_i - E_i)^2}{E_i}$$

For Method 2b this simplifies to

$$\chi^2 = \frac{k}{n}\sum_{i=1}^{k} m_i^2 - n$$

Step 6. Compare the computed value $\chi^2$ with the tabulated percentiles for
a chi-square variate (Table B-5) using k-r-1 degrees of freedom, where r
is the number of parameters that were estimated in Step 1. High values of
$\chi^2$ signify that the observed data contradict the assumed model. For
example, if the above calculated value of $\chi^2$ exceeds the 0.95 tabulated value
of Chi-square, a value this large would occur less than one time in twenty
if the data had come from the assumed distribution.
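
Steps 2 through 5 for equal-probability cells can be sketched as follows. The
sketch assumes the caller supplies the inverse cumulative distribution of the
fitted model; the function names are hypothetical, and the comparison against
the tabulated percentile (Step 6) is left to the analyst.

```python
import bisect
import math


def mann_wald_cells(n):
    """Number of cells for large n: nearest integer to 4[0.75(n-1)^2]^(1/5)."""
    return round(4.0 * (0.75 * (n - 1) ** 2) ** 0.2)


def chi_square_equal_cells(data, inverse_cdf, k):
    """Chi-square statistic for k equal-probability cells.

    inverse_cdf(p) must return the value x with Pr(X <= x) = p under the
    fitted model. Returns (statistic, observed cell counts).
    """
    n = len(data)
    # k - 1 interior boundaries at cumulative probabilities 1/k, ..., (k-1)/k.
    boundaries = [inverse_cdf(j / k) for j in range(1, k)]
    counts = [0] * k
    for x in data:
        counts[bisect.bisect_right(boundaries, x)] += 1
    expected = n / k
    statistic = sum((m - expected) ** 2 / expected for m in counts)
    return statistic, counts
```

For an exponential model with unit mean, `inverse_cdf` could be
`lambda p: -math.log(1.0 - p)`. The statistic is then referred to the
chi-square table with k-r-1 degrees of freedom.
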

9.2 KOLMOGOROV-SMIRNOV TEST


This test is used to evaluate the assumption that a sample belongs to a
specified known continuous distribution. It is a distribution-free test and is
a good test for small samples. In general, it is a more powerful test than
the Chi-square where it is applicable. Although the test is designed for
comparing a sample against a specified and known distribution, the test is robust
enough that it may still be applied to distributions whose parameters are
estimated from the sample data. The effect of estimating the parameters of
the distribution from the sample is to reduce the critical level of the $d_\alpha(N)$
statistic, i.e., the tabulated value is conservative and the actual probability
of falsely rejecting a correct model is smaller than the α associated
with the chosen $d_\alpha(N)$. Hence, if the chosen $d_\alpha(N)$ statistic value is
exceeded in the test, it can be safely concluded that the discrepancy is
significant. Grouping observations into intervals also tends to lower the
value of d. For grouped data, therefore, the appropriate significance
levels for testing should be chosen smaller than the significance levels used
for a complete sample.

The  test  is  used  as  follows: 

Step 1. Rearrange the sample of size n to obtain the ordered sample
$x_1, x_2, \ldots, x_n$ where $x_1 \le x_2 \le \ldots \le x_n$.

Step 2. The sample cumulative distribution then takes on values of
1/n, 2/n, ..., n/n at the points $x_1, \ldots, x_n$.

Step 3. Calculate the cumulative frequency values for the assumed
distribution at the sample values $x_1, x_2, \ldots, x_n$.

Step 4. Determine the maximum deviation, d, between the sample
cumulative distribution and assumed cumulative distribution from Steps 2 and 3.

Step 5. Compare the calculated deviation d with the test statistic $d_\alpha(n)$
found from Table B-6 for the desired level of significance. If d exceeds
the value $d_\alpha(n)$ then the assumption that the sample comes from the
assumed distribution may be rejected at the 100α percent significance level.
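
Steps 1 through 4 can be sketched as below. As an assumption beyond the text,
the sketch checks the deviation on both sides of each jump of the sample
cumulative distribution (just below and at each ordered value), which is a
common way of making sure the maximum is not missed. The function name is
hypothetical, and the critical value must still be taken from the table.

```python
def ks_statistic(sample, cdf):
    """Maximum deviation between the sample and assumed cumulative distributions."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # Sample distribution jumps from (i-1)/n to i/n at the ith ordered value.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d
```

For example, `ks_statistic(data, lambda x: x)` compares a sample with the
uniform distribution on the unit interval.
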

9.3 W-TEST

This test is used to evaluate the assumption that a sample has a normal
distribution. It can be used to test the assumption that a sample fits a
log-normal distribution by using the log of the sample values. The W-test has
been shown to be an effective technique for evaluating the assumption of
normality against a wide spectrum of non-normal alternatives, even if only
a relatively small number of observations are available. It is generally
more powerful than the $\chi^2$, especially for small sample sizes.

The W-test is used as follows:

Step 1. Rearrange the sample to obtain the ordered sample $x_1, x_2, \ldots, x_n$,
where $x_1 \le x_2 \le \ldots \le x_n$.

Step 2. Compute

$$S^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$$

where $\bar{x}$ is the data mean.

Step 3. If n is even, set k = n/2; if n is odd, set k = (n-1)/2. Then
compute

$$b = \sum_{i=1}^{k} a_{n-i+1}\,(x_{n-i+1} - x_i),$$

where the values of $a_{n-i+1}$ for i = 1, ..., k are given in Table B-7 for
n = 3, ..., 50. Note that when n is odd $x_{k+1}$ does not enter into this
computation.

Step 4. Compute the test statistic

$$W = b^2/S^2.$$

Step 5. Compare the calculated value of W with the percentiles of the
distribution of this test statistic shown in Table B-8. This table gives the
minimum values of W that we would obtain with 1, 2, 5, 10, and 50 percent
probability as a function of n, if the data actually came from a normal
distribution. Small values of W indicate non-normality: if the calculated W is
lower than the tabulated value for the selected level of significance, then the
hypothesis of normality can be rejected; otherwise it is not rejected.
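
Steps 1 through 4 of the W-test can be sketched as follows. The coefficients
are not computed here; the sketch assumes they are read from the coefficient
table and passed in, ordered so that the first one multiplies the difference
between the largest and smallest observations. The function name is
hypothetical.

```python
def w_statistic(sample, coefficients):
    """W = b^2 / S^2 for a sample and its tabulated coefficients.

    coefficients[i] multiplies (x[n-1-i] - x[i]) of the ordered sample and
    must contain exactly n // 2 values.
    """
    xs = sorted(sample)
    n = len(xs)
    k = n // 2
    if len(coefficients) != k:
        raise ValueError("need exactly n // 2 coefficients")
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs)
    b = sum(a * (xs[n - 1 - i] - xs[i]) for i, a in enumerate(coefficients))
    return b * b / s2
```

For n = 3 the single coefficient is commonly tabulated as 0.7071, so three
equally spaced observations give a value of W very close to one.
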

9.4 WE-TEST

This test is used to evaluate the assumption that a sample has an
exponential distribution with the origin unknown. Percentiles of the WE
distribution have not yet been tabulated for sample sizes other than 7 to 35.

The comments on the W-test are also applicable here.

The WE-test is used as follows:


Step 1. Calculate the test statistic:


where $x_i$, i = 1, ..., n, are the n observed values, $x_1$ is the smallest
value, and $\bar{x}$ is the data mean.

Step 2. Compare the computed value WE with the 95 percent and 90 percent
ranges given in Table B-9. This test is two-sided in that too-low or
too-high values indicate non-exponentiality. Thus, if the computed WE value
falls outside the 95 or 90 percent range, then the chances are less than
one in 20 or one in 10, respectively, that the observed sample was drawn
from an exponential distribution.

9.5 WE0-TEST

This test is used to evaluate the assumption that a sample has an
exponential distribution with the origin ε known. However, percentiles of the
distribution of $WE_0$ have not been tabulated for sample sizes other than 7 to 35.
The comments on the W-test are also applicable here.

The $WE_0$-test is used as follows:

Step 1. Subtract the known location ε from each of the sample values $x_i$.

where $x_i$, i = 1, ..., n, are the n sample values and $\bar{x}$ is the sample
mean.

Step 3. Determine whether the computed value $WE_0$ lies outside the tabulated
95 percent and 90 percent ranges shown in Table B-10 as a function of n. This
test is two-sided in that too-low or too-high values indicate non-exponentiality.
Thus, if the computed value of $WE_0$ falls outside the 95 percent range, the
chances are less than one in twenty that the observed sample was drawn from
an exponential distribution with the assumed origin.

COMPLEX PARAMETRIC DISTRIBUTIONS

A.1 INTRODUCTION


Although the reader probably has a good general knowledge
of the simple parametric distributions, he is likely to be unfamiliar
with the complex parametric distributions. The main text of this
volume indicates when and how to use these distributions, but all
without requiring a thorough understanding of the complex distributions.
This appendix is intended to give the reader some background
information on the complex distributions so that he will be better able to
understand and use the related material in the main text.

A.2 WEIBULL DISTRIBUTION


The Weibull distribution is best known for its application to reliability
analysis where it is known to fit a large class of life (time to failure)
distributions (53). The basic approach suggested by Weibull is to define a
function φ(x) such that the cumulative distribution function F is given by

$$F(x) = P[X \le x] = 1 - e^{-\varphi(x)} = \int_{-\infty}^{x} f(x)\,dx.$$

One of the simplest forms for φ(x) is

$$\varphi(x) = \frac{(x-\epsilon)^{\eta}}{\lambda}, \quad x \ge \epsilon$$

$$\varphi(x) = 0, \quad x < \epsilon$$

in which case

$$F(x) = 1 - \exp\left[-\frac{(x-\epsilon)^{\eta}}{\lambda}\right], \quad x \ge \epsilon$$

$$F(x) = 0, \quad x < \epsilon$$

and

$$f(x) = \frac{\eta}{\lambda}\,(x-\epsilon)^{\eta-1} \exp\left[-\frac{(x-\epsilon)^{\eta}}{\lambda}\right], \quad x \ge \epsilon$$

$$f(x) = 0, \quad x < \epsilon$$

The parameter ε is called the location parameter in the sense that it
defines the lower limit for the random variable x. For the special case
where ε = 0,

$$f(x) = \frac{\eta}{\lambda}\, x^{\eta-1} e^{-x^{\eta}/\lambda}$$

and

$$F(x) = 1 - e^{-x^{\eta}/\lambda}.$$

The values of η and λ may be selected to provide a large number
of shapes, some of which are sketched below in Fig. A.1. For this reason
η is called a shape parameter and λ is called a scale parameter since it
scales the value of x.


Fig. A.1. Weibull Distribution for Various Values of
Parameter η

It should be noted from Fig. A.1 that η = 1/2 might represent
the shape parameter for the early failure region and η = 3 the shape
parameter for the wear-out region in a typical reliability application.

A.3 JOHNSON DISTRIBUTIONS


These distributions were proposed by Johnson who used
transformations of the normalized normal random variable to generate
empirical distributions (21, 22). The main advantages of this approach
are that percentiles of the empirical distribution may be obtained using
a table of the normal probability distribution, as will become apparent
later, and that the approach encompasses a broad class of problems.

To introduce the Johnson distributions, assume that it is
desired to obtain a probability density function for the random variable
X about which little or no information is available. Then, a general
transformation from X to Z is postulated, where Z is a normalized
normal random variable, as follows

$$Z = \gamma + \eta\,T(X),$$

where γ and η are parameters to be determined.

In most situations, the transformation T(X) will be unknown.
However, Johnson proposed three families of distributions, referred
to as the $S_L$, $S_B$, and $S_U$ systems, respectively, defined as follows

$$S_L \text{ (Log-normal):}\quad T(x) = \ln(x-\epsilon); \quad x > \epsilon$$

$$S_B \text{ (Bounded):}\quad T(x) = \ln\left(\frac{x-\epsilon}{\lambda+\epsilon-x}\right); \quad \epsilon < x < \epsilon+\lambda$$

$$S_U \text{ (Unbounded):}\quad T(x) = \sinh^{-1}\left(\frac{x-\epsilon}{\lambda}\right); \quad -\infty < x < \infty$$

The undefined regions for x above imply T(x) = 0.


Similar to the Weibull distribution, η and γ are shape parameters,
λ is a scale parameter, and ε plays the role of location parameter
which shifts the region of relevancy for x. These parameters are
subject to the following constraints:

$$\eta > 0; \quad \lambda > 0$$

$$-\infty < \gamma < \infty$$

$$-\infty < \epsilon < \infty$$

In some cases, these parameters may be identified from a
basic understanding of the process. For example, if the random
variable x must be non-negative, then ε = 0 and the S_L, or log-normal
distribution, might be appropriate. If x is restricted to a finite region,
ε < x < ε + λ, then S_B (bounded distribution) may be appropriate.

An infinite range for x would suggest the S_U (unbounded) distribution.
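
The transformation Z = γ + ηT(X) can be inverted to obtain percentiles and random variates directly from standard normal values. The following is a minimal Python sketch under stated assumptions: the function names (`johnson_inverse`, `johnson_percentile`, `johnson_sample`) and the family labels "SL", "SB", "SU" are illustrative choices rather than part of the original report, and the standard library's `NormalDist` stands in for a printed normal table.

```python
import math
import random
from statistics import NormalDist


def johnson_inverse(z, family, gamma, eta, eps=0.0, lam=1.0):
    """Map a standard normal value z to x by inverting Z = gamma + eta*T(x)."""
    t = (z - gamma) / eta  # the value that T(x) must take
    if family == "SL":     # T(x) = ln(x - eps)
        return eps + math.exp(t)
    if family == "SB":     # T(x) = ln((x - eps) / (lam + eps - x))
        return eps + lam / (1.0 + math.exp(-t))
    if family == "SU":     # T(x) = asinh((x - eps) / lam)
        return eps + lam * math.sinh(t)
    raise ValueError("family must be 'SL', 'SB', or 'SU'")


def johnson_percentile(p, family, gamma, eta, eps=0.0, lam=1.0):
    """Percentile x_p: look up the normal percentile z_p, then invert T."""
    z_p = NormalDist().inv_cdf(p)
    return johnson_inverse(z_p, family, gamma, eta, eps, lam)


def johnson_sample(n, family, gamma, eta, eps=0.0, lam=1.0, seed=None):
    """Draw n Johnson variates by transforming standard normal deviates."""
    rng = random.Random(seed)
    return [
        johnson_inverse(rng.gauss(0.0, 1.0), family, gamma, eta, eps, lam)
        for _ in range(n)
    ]
```

For example, `johnson_percentile(0.5, "SB", 0.0, 1.0, eps=2.0, lam=4.0)` gives the median of a bounded distribution on (2, 6); with γ = 0 this is the midpoint of the interval.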

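The probability density functions for the distributions are as
follows:

S_L:  $$f_1(x) = \frac{\eta}{\sqrt{2\pi}\,(x - \varepsilon)} \exp\left\{-\tfrac{1}{2}\left[\gamma + \eta \ln(x - \varepsilon)\right]^2\right\} ; \quad \varepsilon < x$$

S_B:  $$f_2(x) = \frac{\eta}{\sqrt{2\pi}}\,\frac{\lambda}{(x - \varepsilon)(\lambda + \varepsilon - x)} \exp\left\{-\tfrac{1}{2}\left[\gamma + \eta \ln\left(\frac{x - \varepsilon}{\lambda + \varepsilon - x}\right)\right]^2\right\} ; \quad \varepsilon < x < \varepsilon + \lambda$$

S_U:  $$f_3(x) = \frac{\eta}{\sqrt{2\pi}}\,\frac{1}{\sqrt{(x - \varepsilon)^2 + \lambda^2}} \exp\left\{-\tfrac{1}{2}\left[\gamma + \eta \sinh^{-1}\left(\frac{x - \varepsilon}{\lambda}\right)\right]^2\right\} ; \quad -\infty < x < \infty$$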
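
As a brief derivation sketch, added for clarity and not taken from the original text, each density follows from the transformation Z = γ + ηT(X) by a change of variables. The symbol φ for the standard normal density is introduced here only for this sketch.

```latex
\begin{align*}
f(x) &= \varphi\bigl(\gamma + \eta\,T(x)\bigr)\,\eta\,\frac{dT}{dx},
\qquad \varphi(z) = \frac{1}{\sqrt{2\pi}}\,e^{-z^{2}/2}, \\[4pt]
S_L:\quad \frac{dT}{dx} &= \frac{1}{x - \varepsilon}, \\[4pt]
S_B:\quad \frac{dT}{dx} &= \frac{1}{x - \varepsilon} + \frac{1}{\lambda + \varepsilon - x}
  = \frac{\lambda}{(x - \varepsilon)(\lambda + \varepsilon - x)}, \\[4pt]
S_U:\quad \frac{dT}{dx} &= \frac{1}{\lambda}\,
  \frac{1}{\sqrt{1 + \bigl((x - \varepsilon)/\lambda\bigr)^{2}}}
  = \frac{1}{\sqrt{(x - \varepsilon)^{2} + \lambda^{2}}}.
\end{align*}
```

Since η > 0 and each T(x) is increasing on its range, dT/dx is positive and no absolute value is needed.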
The density function for the S_L system is a three-parameter
distribution commonly called the log-normal distribution. This is
known to describe many familiar events such as amount of inheritance,
income, particle size from breakage, etc.

As previously mentioned, the class of situations encompassed
within these distributions is large. An indication of the flexibility
in defining a large number of shapes is evidenced in Figs. A.2 to
A.4, which illustrate several forms of the S_L, S_B, and S_U density
functions.

The difference between the three types of Johnson distributions
can be characterized by the relationship between the distribution
skewness and distribution kurtosis. Section 8 of this volume contains a
discussion of skewness and kurtosis; however, a summary definition is
that

$$\sqrt{\beta_1} = \mu_3 / \sigma^3 \quad \text{(skewness)}$$

and

$$\beta_2 = \mu_4 / \sigma^4 \quad \text{(kurtosis)}$$

To help in the definition of the relative variation in β₁ and β₂,
Johnson prepared the results as shown in Fig. A.5. Note that the
log-normal distribution is defined by a line given by:

$$\beta_1 = (\omega - 1)(\omega + 2)^2 ; \quad \eta > 0$$

and

$$\beta_2 = \omega^4 + 2\omega^3 + 3\omega^2 - 3 ; \quad \eta > 0$$

where

$$\omega = e^{1/\eta^2}$$

and η is the shape parameter for S_L.
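
The log-normal line can be traced numerically by evaluating β₁ and β₂ over a range of η. The sketch below is illustrative only; the function name `lognormal_beta_line` is an assumption made for this example, and β₁ is taken to be the squared skewness so that it matches the expression (ω − 1)(ω + 2)².

```python
import math


def lognormal_beta_line(eta):
    """Return (beta1, beta2) on the S_L (log-normal) line for shape eta > 0."""
    if eta <= 0.0:
        raise ValueError("eta must be positive")
    w = math.exp(1.0 / eta ** 2)                       # omega = e^(1/eta^2)
    beta1 = (w - 1.0) * (w + 2.0) ** 2                 # squared skewness
    beta2 = w ** 4 + 2.0 * w ** 3 + 3.0 * w ** 2 - 3.0  # kurtosis
    return beta1, beta2


# Points along the line; large eta approaches the normal point (0, 3).
line_points = [lognormal_beta_line(eta) for eta in (0.8, 1.0, 2.0, 5.0, 100.0)]
```

As η grows, ω tends to 1 and the pair (β₁, β₂) tends to (0, 3), the point that represents the normal distribution.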
Fig. A.3. Johnson Probability Density Functions for S_B (ε = 0 ; λ = 1)

Fig. A.5. Regions of Definition for Johnson Distributions Based on
Skewness and Kurtosis
It should be recognized that estimates for β₁ and β₂ may lead
to a wrong conclusion as to the type of distribution to be used. The
confidence that this will not occur is related to the accuracy of the
estimates. In case of doubt, a goodness-of-fit test may be used to
help in a decision.
A.4 PEARSON DISTRIBUTIONS


A general class of probability density functions known as the
Pearson family is given by solutions of the differential
equation:

$$\frac{df}{dx} = \frac{(x + a)\,f}{b_0 + b_1 x + b_2 x^2}$$

The solutions of this equation were classified by Pearson into twelve
families of curves shown in Table 8.1. These curves are displayed
in Fig. A.6. The Pearson distributions are related to the standard
densities frequently discussed. For example, the gamma distributions
are Pearson's Type III curves, the normal is a Type VII, and the beta
is a Type I, while the beta with parameters α = β is represented by
the Pearson Type II curves.

This system of density functions is very appealing from the
standpoint of fitting sample data, the reason being that only the first
four moments need be calculated. Pearson's method of fitting sample
data consists of the following steps:

1.  Compute the first four moments, μ₁, μ₂, μ₃, μ₄,
    of the sample data.

2.  Calculate the numerical value of the parameters
    β₁ and β₂, where:

    √β₁ = skewness,

    β₂ = kurtosis.

    These parameters determine the type of
    Pearson distribution which appropriately
    matches the sample data.

3.  Equate the observed (sample) moments to
    the moments of the appropriate distribution
    expressed in terms of its parameters, and

4.  Solve the resulting equations for those parameters,
    thereby completely specifying the distribution
    function.

Fig. A.6. Typical shapes of Pearson distributions (Sheets 1 and 2 of 2)
Note: Type IV and VII appear as normal distributions.

Fig. A.7. Types of Skew Frequency for Values of β₁ and β₂
for the Pearson System
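
Steps 1 and 2 above can be sketched in a few lines of Python. This is an illustrative sketch, not the report's own procedure: the function name `sample_betas` is hypothetical, moments are computed with divisor n (no small-sample correction), and β₁ is taken as the squared skewness μ₃²/μ₂³ so that it is directly comparable with curves drawn in the (β₁, β₂) plane.

```python
def sample_betas(data):
    """Return (mean, variance, beta1, beta2) from the first four sample moments."""
    n = len(data)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = sum(data) / n
    # Central moments of order 2, 3, and 4 (divisor n).
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    if m2 == 0.0:
        raise ValueError("data have zero variance")
    beta1 = m3 ** 2 / m2 ** 3   # squared skewness
    beta2 = m4 / m2 ** 2        # kurtosis
    return mean, m2, beta1, beta2
```

For a small sample the resulting (β₁, β₂) point carries considerable sampling error, so it is best treated as a guide to candidate types rather than a final selection.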

The relationships between β₁ and β₂ for a given Pearson distribution have
been represented in a convenient graphical form in the so-called (β₁, β₂)
plane shown in Fig. A.7. The normal distribution corresponds to the point
β₁ = 0, β₂ = 3 in the (β₁, β₂) plane. Type III distributions are to be chosen
when the point (β₁, β₂) is on the line 2β₂ − 3β₁ − 6 = 0 and Type V when (β₁, β₂)
is on the cubic

$$\beta_1 (\beta_2 + 3)^2 = 4\,(4\beta_2 - 3\beta_1)(2\beta_2 - 3\beta_1 - 6).$$

In considering the subtypes under Type I, a biquadratic in β₁ and β₂
separates the area of the J-shaped curves from the regions of limited
range modal curves and the region of the U-shaped curves.

In summary, the curves traced in the (β₁, β₂) plane provide
a means of selecting the Pearson distribution appropriate to a given
collection of sample data. For further details and numerical examples
see Elderton(10) and Kendall(27).
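
The three loci named above (the normal point, the Type III line, and the Type V cubic) can be checked numerically for a given (β₁, β₂) pair. The sketch below covers only those three cases; the function name `pearson_locus` and the tolerance handling are assumptions made for illustration, and every other region of the plane is simply reported as "other".

```python
def pearson_locus(beta1, beta2, tol=1e-6):
    """Report whether (beta1, beta2) lies on one of the loci named in the text."""
    # Normal distribution: the single point beta1 = 0, beta2 = 3.
    if abs(beta1) <= tol and abs(beta2 - 3.0) <= tol:
        return "normal"
    # Type III: the line 2*beta2 - 3*beta1 - 6 = 0.
    line = 2.0 * beta2 - 3.0 * beta1 - 6.0
    if abs(line) <= tol:
        return "Type III"
    # Type V: the cubic beta1*(beta2+3)^2 = 4*(4*beta2-3*beta1)*(2*beta2-3*beta1-6).
    lhs = beta1 * (beta2 + 3.0) ** 2
    rhs = 4.0 * (4.0 * beta2 - 3.0 * beta1) * line
    if abs(lhs - rhs) <= tol * max(1.0, abs(lhs), abs(rhs)):
        return "Type V"
    return "other"
```

With estimated moments an exact match is unlikely, so in practice `tol` would be widened, or the distance to each curve compared, before settling on a type.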
TABLE B-1

Unbiasing Factors for the M.L.E. of η

n      5     6     7     8     9     10    11    12    13    14    15    16
B(n)   .669  .752  .792  .820  .842  .859  .872  .883  .893  .901  .908  .914

n      18    20    22    24    26    28    30    32    34    36    38    40
B(n)   .923  .931  .938  .943  .947  .951  .955  .958  .960  .962  .964  .966

n      42    44    46    48    50    52    54    56    58    60    62    64
B(n)   .968  .970  .971  .972  .973  .974  .975  .975  .977  .978  .979  .980

n      66    68    70    72    74    76    78    80    85    90    100   120
B(n)   .980  .981  .981  .982  .982  .983  .983  .984  .985  .986  .987  .990


TABLE B-2

Percentiles of the Normal Distribution

$$F(z) = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}}\, e^{-t^2/2}\, dt$$

9 

.01 

.02 

□ 

Q 

□ 

.00 

.07 

.OS 

.09 

.0 

.5040 

m 

m 

tr?' 

.5109 

.5239 

.5279 

.5319 

.5359 

.1 

.5308 

.5438 

.5478 

.5517 

.5557 

.5590 

.5030 

.5075 

.5711 

.5753 

.2 

.6793 

.5832 

.5871 

.5910 

.5948 

.5987 

Si.'i/Ti! 

.0001 

.0103 

.0111 

.3 

.0170 

.0217 

BTiyw 

.0293 

.0331 

.0308 

KM 

.6143 

.6480 

.0517 

.4 

.0551 

.0004 

.0700 

.0772 

.6811 

.0879 

.5 

ITof 

9P7JTT 

.7054 

1.  .  . 

.7123 

.7157 

.7224 

.0 

.7257 

.7291 

'.7321 

.7357 

.7389 

.7422 

.7151 

.7-ISii 

.7517 

.7519 

.7 

■FEZl 

.7042 

.7073 

,7704 

.7734 

.7701 

.7791 

.7823 

.7852 

.8 

.7881 

■viiji*. 

.7930 

■BUS 

REE 

.8023 

l.-'.lKI  1 

.8078 

■BEE 

.8133 

.9 

.8159 

.8212 

.8238 

.8204 

.8280 

.8315 

.8310 

.8305 

.8389 

1.0 

.8113 

.3438 

ww 

.8-185 

.8508 

.8531 

.8554 

.8577 

.8599 

.8021 

1.1 

HUE 

.8005 

.S'.'** 

.8708 

.8729 

.8719 

.8770 

.8790 

.8S10 

»:Wtl 

J.3 

.8819 

.8809 

.n 

.8907 

•89Z5 

.8944 

.89G2 

i  Pi  £SC 

.S997 

1.3 

.9032 

V" 

.9082| 

.9099 

PEITT? 

.9131 

.9147 

.9162 

.9177 

1.4 

.9192 

.9-22 

.0230 

.9251 

© 

.0292 

1.5 

.9332 

ETET?*! 

.9382 

■  : 

:  TE 

.0441 

1.0 

BTFhr) 

.9103 

■PiPe 

.9515 

.0525 

.9535 

.9545 

1.7 

KifrV*! 

.9504 

KMil 

.9599 

■i®, 

KTIE 

.9025 

.9033 

1.8 

K  Mfl 

Hit  i  Bi<Wrir! 

.9071 

kme 

■ES2 

.9093 

.9099 

.9700 

1.0 

.9713 

Bali 

.9732 

.9744 

mB 

.0750 

K  ri.ii 

2.0 

.9772 

.9778 

.0783 

K3E 

.9793 

.9798 

.9812 

2.1 

PlijT/Tr 

.a-il 

.9834 

■PkFsI 

.9842 

Ira 

.9857 

2.2 

mmi\ 

•9S&< 

.9871 

.9S8I 

mMW 

.9SS7 

.9800 

2.3 

..9893 

,;  ESl  IRIUJ 

.9909 

.9911 

.0913 

.9910 

2.4 

.99^ 

.9913  .9925 

.9927 

wEi M 

.9930 

2.5 

.9938 

RSf 

.9948 

RJI 

RIB 

.9952 

2.0 

.9953 

.995#- 

KliM 

.9957 

MvU 

.9900 

.9901 

Huiim! 

.9904 

2.7 

.9900 

.9907 

.9968 

mfm 

.9970 

.9971 

.9972 

.9973 

.9974 

2.8 

Mmfl 

.9977 

.9978 

.9979 

.9979 

mmv 

.9981 

2.9 

.9981 

.9944- 

.9981 

.9981 

.9985 

B. 

m  : 

8.0 

.9988 

.9989 

MS 

.9990 

3.1 

.9990 

.9901 

.9991 

■  Ml 

.9992 

.9992 

.9992 

.9992 

.9993 

.9993 

3.2 

.9903 

.9993 

.9994 

■  ACT 

mmz' 

STPICT1 

.9991 

.9995 

■EMc 

.9995 

3.3 

was 

.5995 

.9995 

■  55SS 

n 

.9990 

.0990 

■IWIH 

.0996 

.9997 

3.4 

.9997 

.9997 

.9997 

.9997 

.9997 

.9998 

z              1.282   1.645   1.960   2.326   2.576   3.090   3.291   3.891    4.417
F(z)           .90     .95     .975    .99     .995    .999    .9995   .99995   .999995
2[1 - F(z)]    .20     .10     .05     .02     .01     .002    .001    .0001    .00001

(From A. M. Mood, Introduction to the Theory of Statistics, McGraw-Hill,
1950.)
TABLE B-3


Tables to Facilitate Fitting Johnson S_U Distribution*
Values of −γ


A 

444 

414 

J 

»IS 

430 

433 

434 

•  16 

•  46 

446 

460 

u 

0-3474 

0-7373 

1-334 

1440 

4191 

4399 

M 

-ms 

•4634 

47747 

1143 

I6M 

3-477 

H 

•1740 

•3620 

•1499 

49164 

MM 

1-669 

2  334 

3  464 

M 

•1431 

•3904 

0634 

•4364 

49430 

1-146 

1-646 

2146 

H 

0-1144 

43436 

43774 

06340 

44997 

44116 

1187 

1666 

3-139 

3-167 

H 

-1034 

•2103 

•3330 

•4467 

■1907 

•7666 

0  9661 

1  236 

3-614 

2-18* 

M 

•0414 

•1663 

•3944 

■3621 

•6138 

•6616 

•6197 

1  028 

1)02 

1-6*7 

M 

•4430 

•1661 

■3643 

•3490 

■4636 

•672) 

•7127 

0  SSI4 

1-096 

1  376 

M 

-0744 

•1607 

•3303 

•3140 

■4076 

•6113 

■6304 

•773) 

0  9470 

1-169 

4-1 

00444 

41361 

43I0S 

43476 

43707 

0-4039 

0  667) 

46903 

0  036) 

1018 

M 

0333 

■1376 

•1443 

■3647 

■3404 

•42)4 

•6H6 

■6343 

•700) 

0*03! 

41 

-0444 

•1146 

•1406 

•3466 

•till 

•39-17 

•4747 

•6708 

•6814 

•81?? 

4-4 

0443 

•1112 

•1444 

•3394 

•29)7 

•>o:t: 

■4)37 

•6266 

•6261 

•7407 

41 

0414 

•1046 

•1644 

■3163 

-3762 

•3306 

•4100 

•4891 

•6780 

•601 1 

4-4 

0-0441 

40  j '  V 

41440 

43031 

42603 

43192 

0-3644 

046*4 

0  6382 

0  6311 

4-t 

0444 

•0033 

■1431 

•1933 

■2461 

•3014 

•3*22 

•4?RN 

■501(1 

■6HK6 

44 

•0444 

•0891 

•13U 

■1338 

•3)27 

■2667 

•3424 

404b 

4744 

two 

44 

•0434 

•OH.:! 

■1290 

•1743 

■3216 

•2717 

•3264 

■1636 

•4484 

i>“yt 

H 

•0404 

0316 

•1234 

•1666 

•3117 

■2692 

■3099 

3648 

■4254 

•4(*2£ 

St 

00340 

00783 

01164 

41607 

43027 

4124*0 

0  2981 

0  3479 

04060 

0-4074 

S-» 

0374 

•0762 

•llll 

•1634 

•1940 

•r:i7K 

•2*06 

•3?2* 

■3866 

-4<63 

1) 

-0341 

07' t. 

•10M 

•1477 

•1*72 

•2K5 

•27731 

•3111 

•asm 

•4265 

s: 

-0344 

•07  if' 

■1067 

•1424 

•1804 

•27(11 

•2f  :0 

•3l-3* 

•Si  17 

•4076 

Si 

•0337 

•0671. 

•1023 

•1376 

•1742 

•2123 

•2626 

•  2962 

*144 1 1 

•3813 

M 

00334 

0066* 

0-03** 

0-1331 

01*84 

02052 

0  247* 

0  r"is 

o  r.?H 

0?.7‘T> 

5-7 

-0314 

C  . 

•095* 

•1290 

•1631 

•19hG 

•236S 

■« 

■317.-. 

•?;.;'B 

68 

•0307 

l  j;  * 

•09)0 

•ISM 

•l»“2 

•1926 

•22*4 

.-■M3 

•SCft* 

•7.1,4  | 

3-4 

o  m 

•09IH 

■1216 

•1531, 

•18f,H 

•2216 

■2661 

■2907 

44 

•0240 

08*3 

■0*70 

■1162 

•1493 

•1616 

■2161 

•760 4 

•2679 

•.it;  H 

41 

00243 

00/04 

40666 

41161 

41463 

41700 

0-2091 

42433 

0  27W4 

0  3  5  >10  1 

4-3 

•0274 

•0683 

•0*36 

•1121 

•1416 

•I7III 

■203* 

•2.380 

•2713 

•l».i' 8  1 

4-S 

•0269 

•0540 

•0*14 

•1094 

•1380 

•1670 

•19*3 

•raoi 

■26-:  .7 

4-4 

0263 

•O', 27 

•0*96 

•1067 

■1347 

•ie:ir> 

•19)3 

•12^5 

•2574 

•?<»•;  . 

4-S 

0267 

0616 

•0777 

•1043 

■1)16 

•1696 

•1687 

•2110 

■2M0 

-2b4:.  ; 

44 

0-0241 

00604 

40760 

41020 

41380 

41*00 

41643 

0-244* 

02776  | 

47 

-0246 

•04C7 

•0743 

•0P9H 

■1286 

•1A2B 

•1802 

•?o:,o 

2391 

■2701, 

44 

-0341 

04  K3 

•0738 

•0977 

1231 

■1492 

■1762 

•2043 

•2337 

•VC  4  8 

4-4 

•0238 

•0473 

•071? 

•0967 

•1200 

•1401 

•1726 

•1999 

•22*6 

•?/.(.(> 

74 

0332 

0464 

■0699 

•0938 

•1162 

■1432 

■1690 

■1957 

■2237 

•2M<> 

1 

7-1 

00337 

0  0456 

0-0666 

40220 

41159 

41404 

01666 

01918 

0*190 

02476  j 

7-3 

•0333 

0447 

•0673 

■PP03 

■11)7 

•1377 

■1624 

•  |KNO 

•2147 

•742* 

73 

•0313 

•0439 

•0661 

•0*87 

•me 

■1352 

•1694 

■1841 

•3108 

■7377 

7-4 

0216 

•0431 

•0660 

■0871 

■I09G 

■1327 

•1666 

•1610 

•209* 

•2331  1 

7-S 

•0313 

■0424 

0639 

-0886 

•1077 

•1304 

■1637 

•1777 

•2077 

•itlh;  | 

74 

0-0304 

00417 

40634 

40843 

41069 

4I2R2 

41610 

01746 

0  1991 

0-2246  ! 

7-7 

0306 

0410 

•0614 

•0626 

•1042 

•1360 

•1415 

•1716 

■lose 

2906 

7-S  * 

•0302 

•0404 

•0606 

■0816 

•1026 

•1240 

•14*0 

•ICH7 

■1922 

•2187 

7-4 

0166 

0398 

•0549 

•0802 

•1009 

■1220 

•1437 

•1660 

■1891 

•2131  | 

M 

0146 

0393 

■0690 

•0790 

•099) 

•1201 

•1414 

- 1633 

•I860 

•2(»siA  i 

S-3 

0-0140 

00)60 

40672 

00767 

40944 

41166 

41371 

41683 

0- 1802 

0-2029  ! 

S-4 

0166 

■0370 

•0667 

•0746 

•0937 

•11)3 

•13)3 

■1637 

•174*1 

•  loft**  ; 

S-4 

0140 

•0300 

•0642 

•0726 

•091? 

•HOI 

■1296 

■1494 

•1009 

•1*12  1 

M 

0176 

•0361 

■0626 

•0707 

068* 

•107) 

•12*1 

■1464 

■Il'.V* 

M 

0171 

■0343 

•0616 

0689 

•0866 

•1040 

•1229 

■1417 

•1*10 

•1810  | 

4-3  - 

MB 

•  1199 

41)82 

0  1670 

0-1764 

M 

— 

““ 

— 

•1171 

•1349 

•1632 

1721 

TABLE B-3 (continued)


Values of −γ (continued)


1-41  1*45 


inc 


M  |  1-447 


4-1  I  1'Ml 
1141 
M  I  MM 
04  1-044 


■MS 

-•Tf  W) 
•M4  1-4M 

•799  MW 
-4M  1-M4 


7  C.  |  0  0430 
•0444 
•5X78 
•4117 
•6947 


mm 


•4434 

•4951 

•4714 

0-4748 

0  4154 

•4444 

•5040 

5494 

■4587 

E3 

•4343 

•4511 

KTTTM 

•4434 

-4403 

•5909 

0-4349 

0-4794 

04119 

•4309 

■TllLM 

■2™ 

•4937 

•4474 

•4175 

•4504 

•4884 

•4115 

•4434 

-4740 

0-4001 

04311 

04444 

KjiLa 

•4199 

•4413 

•3744 

•4089 

kutm 

•3703 

•3974 

•4973 

•3414 

-3881 

•41M 

0-3533 

09789 

0-4043 

-3454 

■3703 

-3454 

E41 


imm 


TABLE  B-3  (continued) 


Values  of  —  f  ( continued ) 


14* 

103 

14* 

1-7* 

171 

141 

i-ei 

141 

141 

240 

1 

M 

MN 

1411 

•■1 

>140 

0-101 

M 

1447 

>700 

M 

1414 

1410 

H 

1-001 

1-101 

« 

14*0 

>110 

•4 

1407 

14*0 

1401 

440* 

1-017 

1440 

1-MO 

>042 

04 

1-400 

1-740 

>101 

•4*1 

•4 

1-001 

1447 

1-000 

>710 

M 

1314 

1400 

1-001 

>407 

>2 

1-101 

14M 

1-7*1 

1-MO 

M 

1-114 

1420 

1-007 

>100 

M 

1-100 

1470 

1411 

1-071 

44 

1-100 

1-114 

18*0 

1-0*6 

2-201 

1470 

M 

1-111 

1-MO 

1-471 

1-7*0 

1-MO 

1-147 

44 

1-C70 

14U 

1411 

1470 

>061 

>702 

» 

1440 

1-102 

1-107 

16*2 

142* 

2-6*3 

44 

1410 

1-144 

1207 

1-820 

14*1 

1-2*6 

44 

>00111 

1-10* 

1-201 

1400 

1-7*6 

>173 

IM 

4010 

147* 

1210 

1400 

146* 

1430 

10-1 

>0074 

1440 

1-100 

1-341 

1482 

14*0 

10-2 

4141 

1417 

1-144 

1-104 

1416 

1-019 

>310 

1‘76 

144 

•0014 

>0800 

1  110 

1-MO 

1-467 

1-7*1 

>168 

*436 

104 

•0718 

•MOO 

1  07* 

1-220 

1-402 

1-661 

»42r. 

*•726 

10-1 

■UT.3 

•0410 

1-0*0 

1-111 

1-163 

1-682 

1417 

2-496 

IM 

0-tun 

>01M 

1-011 

1140 

1-207 

1-616 

1-820 

2-312 

10-7 

•fitij 

•£980 

O' *001 

1-111 

1-206 

1-461 

1-724 

2-16* 

144 

•7000 

4700 

■0720 

1400 

1-220 

1-408 

1*668 

2-036 

10-9 

•7017 

-000* 

*400 

1407 

1-190 

1-100 

1-690 

1427 

114 

■ictt 

•141* 

■0174 

14*0 

1-166 

1410 

1-628 

1  81* 

>3*1 

*412 

ftl*| 

>7541 

>0140 

>*00* 

1406 

1-124 

1-374 

1-472 

1-748 

>18* 

>111  . 

11-3 

•740.7 

•0080 

•8074 

04111 

1498 

1-236 

1-420 

147* 

2468 

>004 

it* 

•1217. 

•7070 

-BOOB 

■4417 

1467 

1-201 

1-371 

1-606 

1460 

>6*6 

1H 

•7145 

•7700 

4012 

-*170 

1441 

1-1*8 

1-129 

1-644 

1-866 

2-177 

114 

•7024 

•7020 

-6340 

-0174 

141* 

1-137 

1-S88 

1-489 

1-771 

2-224 

114 

0-0007 

>7501 

>010* 

OHtt 

>0016 

1-100 

1-261 

1-430 

1496 

2490 

ii-T 

■4W6 

•7271 

•8024 

•0001 

•070* 

1400 

1-216 

1-391 

1429 

I486 

11*0 

•40 66 

•7140 

-70*0 

•0420 

•949* 

1-064 

1-183 

1-347 

1468 

1-899 

IH 

•0 040 

•7120 

■774* 

•0462 

-0200 

1430 

1-162 

1-207 

1413 

1-804 

134 

•0480 

•7014 

•7*10 

-0300 

•out 

1407 

1-124 

1-270 

1-461 

1-728 

U-l 

>«2V> 

>7407 

>11*0 

>0931 

>M61 

1496 

1-236 

1-414 

1-660 

13-3 

•82YV 

4701 

-7304 

•0011 

4761 

4644 

1-071 

1-202 

1-370 

1-698 

13-3 

•Oloa 

469* 

-714* 

•7071 

•0697 

•0447 

1-046 

1-171 

1-330 

1-842 

|)4 

4111 

4600 

•7111 

•7740 

•0441 

•0260 

1424 

1-14* 

1-293 

1  400 

u-t 

•0010 

4501 

-7011 

•Till 

•2291 

•9061 

1402 

1-116 

1  268 

1-443 

134 

>0004 

>0411 

>6011 

>7411 

>0140 

>0*10 

0  9811 

1-090 

1  228 

1-399 

13-7 

•0001 

-4212 

6011 

•7*74 

•2011 

4747 

•9014 

1-066 

1-194 

1  33* 

134 

•0000 

4127 

•071* 

•7101 

■7870 

•1692 

•9427 

1441 

1-165 

1-320  1 

U9 

•0711 

•0164 

4414 

•71*1 

•7762 

-044* 

•9248 

1421 

1-138 

1  284 

134 

•4002 

4074 

4012 

•7047 

•7610 

•0299 

•9076 

1-000 

1*112 

1-231 

133 

_ 

>7400 

>8030 

>0761 

0-9014 

1064 

1-1*0 

13*4 

•7117 

•7781 

•8465 

•9302 

1421 

1-120 

13*4 

. 

4*88 

•7651 

4185 

4941 

0-9S22 

1-089 

13*3 

4802 

•7336 

•7946 

-8646 

-9460 

1-046 

144 

— 

— 

— 

— 

■6628 

•7136 

•7712 

-8173 

4141 

1006 

143 

_ 

mmm 

>0464 

04949 

>7496 

0-8121 

>9*43 

>9090 

14-4 

__ 

__ 

4111 

•0774 

•7295 

•7686 

•0600 

4369 

14*4 

_ 

4166 

400* 

•710* 

•7C69 

•8310 

-9066 

144 

_ 

4029 

-0464 

•6929 

•7464 

•0072 

-8774 

1*4 

— 

-6*00 

4306 

4762 

•7273 

•7*61 

•0614 



TABLE B-4 (Continued)


Percentiles  of  the  Chi -Squared  Distribution 
(From G. J. Hahn and S. S. Shapiro, Statistical Models in Engineering, John Wiley & Sons, New York,
1967, pp. 314-315.)


TABLE B-6

Percentiles of the Maximum Absolute
Difference Between Sample and
Population Cumulative Distributions*


Sample               Level of Significance (α)
size (N)    0.20      0.15      0.10      0.05      0.01

1           0.900     0.925     0.950     0.975     0.995
2           0.684     0.726     0.776     0.842     0.929
3           0.565     0.597     0.642     0.708     0.828
4           0.494     0.525     0.564     0.634     0.733
5           0.446     0.474     0.510     0.565     0.669
6           0.410     0.436     0.470     0.521     0.618
7           0.381     0.405     0.433     0.486     0.577
8           0.353     0.381     0.411     0.457     0.543
9           0.339     0.360     0.388     0.432     0.514
10          0.332     0.342     0.368     0.410     0.490
11          0.307     0.320     0.352     0.391     0.468
12          0.295     0.313     0.338     0.375     0.450
13          0.284     0.303     0.325     0.361     0.433
14          0.274     0.293     0.314     0.349     0.418
15          0.266     0.283     0.304     0.338     0.404
16          0.258     0.274     0.295     0.328     0.392
17          0.250     0.266     0.286     0.318     0.381
18          0.244     0.259     0.278     0.309     0.371
19          0.237     0.252     0.272     0.301     0.363
20          0.231     0.240     0.264     0.294     0.356
25          0.21      0.22      0.24      0.27      0.32
30          0.19      0.20      0.22      0.24      0.29
35          0.18      0.19      0.21      0.23      0.27
over 35     1.07/√N   1.14/√N   1.22/√N   1.36/√N   1.63/√N

* Values of d_α(N) such that

Pr[ max |S_N(x) − F_0(x)| > d_α(N) ] = α,

where F_0(x) is the theoretical cumulative distribution
and S_N(x) is an observed cumulative distribution
for a sample of N observations.


(From F. J. Massey, "The Kolmogorov-Smirnov Test for Goodness
of Fit," J. Amer. Stat. Ass. 46: 70 (1951).)
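
The statistic tabulated above, the maximum absolute difference between the sample and theoretical cumulative distributions, can be computed as in the following Python sketch. The names `ks_statistic`, `ks_critical_large_n`, and `LARGE_N_COEFF` are illustrative assumptions; the coefficients are those of the "over 35" row, with the 0.01 entry taken as 1.63, and smaller samples require a lookup in the body of the table.

```python
import math

# Large-sample coefficients c such that d_alpha(N) = c / sqrt(N) for N > 35.
LARGE_N_COEFF = {0.20: 1.07, 0.15: 1.14, 0.10: 1.22, 0.05: 1.36, 0.01: 1.63}


def ks_statistic(sample, cdf):
    """Maximum absolute difference between the empirical CDF and cdf(x)."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i-1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_critical_large_n(n, alpha):
    """Critical value d_alpha(N) from the large-sample row (N > 35 only)."""
    if n <= 35:
        raise ValueError("use the tabulated values for N <= 35")
    return LARGE_N_COEFF[alpha] / math.sqrt(n)
```

The hypothesized distribution is rejected at level α when `ks_statistic(sample, cdf)` exceeds the critical value; for example, with N = 40 and α = 0.05 the critical value is 1.36/√40, about 0.215.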
TABLE B-7

Table of Coefficients {a_(n−i+1)} Used in the W Test
for Normality


1 

0.7117 1 

0.6872 

0.6616 

0.6431 

0,6233 

0.6052 

0.5X89 

0.5739 

0.5601 

0.5475 

0.5359 

0.5251 

0.5150 

0.5056 

0.496S 

0.4X86 

0.1677 

0.2413 

0.2X06 

0.3031 

0.3164 

0.3244 

0.3291 

0.3315 

0.3325 

0.3325 

0.3318 

0.3306 

0.32'W 

0.327.7 

0.3753 

.1 

0.0S73 

0.1401 

0.1743 

0.1976 

0.2141 

0.27MI 

0.2347 

0.2412 

0.2460 

0.2495 

0.2521 

0.7540 

0.7553 

4 

0.0561 

0.0947 

0.1224 

0.1429 

0.1586 

0.1707 

0.1X02 

0.1878 

0.1939 

0.1988 

0.2027 

5 

0.0399 

0.0695 

0.C922 

0.1099 

0.1240 

0.1353 

0.1447 

0.1524 

0.1587 

f» 

0.0303 

0.0539 

0.0727 

0.0880 

o.ioot 

(1.1 109 

0.1197 

7 

0.0240 

0.0433 

0.059.7 

0.0725 

0.0X37 

S 

0.0196 

0.0359 

0.0196 

V 

0.01(3 

\ 

(I 

v  " 

20 

21  • 

22 

23 

24 

25 

26 

27 

28 

29 

30 

31 

32 

33 

34 

i 

«  4  M'S 

0.4734 

0.4643 

0.4590 

0.4542 

0.4493 

0.4450 

0.4407 

0.4366 

0.4328 

0.4291 

0.4754 

0.4220 

0.4188 

0  4156 

0  4127 

* 

o  37.12 

0.7211 

0.313$ 

0.3156 

0.3126 

0.3091 

0.3069 

0.3043 

0.3018 

0.2992 

0.7908 

0.2y44 

0.2921 

0.2X9F 

0.7S76 

0.2854 

j 

0  2.V.I 

0.2565 

0  '$7* 

0.2571 

0.7563 

0.2554 

0.2543 

0.2533 

0.2522 

0.2SI0 

0.2499 

0.74X7 

0.2475 

0.2467 

0.7.451 

0.24.79 

4 

tl.7059 

0.20P5 

0.2119 

0.7131 

0.2139 

0.2145 

0.2148 

0.2151 

0.2152 

0.2151 

0.2150 

0.2I4S 

0.21  15 

0.2141 

0.2137 

0.2132 

5 

0  1641 

o.i  r.K6 

0.1736 

0.1764 

0.17X7 

0.1807 

0.1822 

0.1X36 

0.1848 

0.1857 

0.1861 

0.1X70 

0.1  S74 

0.1878 

0.18X0 

0.1882 

6 

a  in 

0.1334 

0.1399 

0.1443 

O.I4fO 

0.1512 

0.1539 

0.1563 

0.1 5X4 

0.1601 

0.1016 

0.1670 

0.1611 

0.1651 

0  1660 

0.1 607 

7 

II  0/32 

O.IOI3 

0.1092 

0.1150 

0.1201 

0.1245 

0.1283 

0.1316 

0.1346 

0.1372 

0.1395 

0.1415 

0.143.7 

0.1419 

0.1463 

0.1475 

« 

11.10.12 

0.0711 

0.0804 

0.0S7X 

0.0541 

0.0997 

0.1046 

O.IOS9 

0.1 12S 

0.1162 

0.1192 

0.1219 

0.1743 

0.176$ 

0.17X4 

0.1301 

*1 

II  lltll.1 

0.0422 

0.0530 

0.0418 

0.06*16 

0.0764 

00823 

O.OS76 

0.0923 

0.096$ 

0.1007 

0  1036 

O.IO06 

0.1093 

0  1 1  IS 

0.1140 

l» 

0.0140 

0.0263 

0.0368 

0.0159 

0.0539 

0.0610 

0.06.72 

0.0728 

0.0778 

O.OS22 

0.0862 

0.0X99 

0.0931 

0.0461 

0.09SS 

II 

0.0122 

0.0228 

0.0321 

0.0403 

0.0476 

0.0540 

0.059S 

0.0650 

O.Ot.97 

0.0739 

0  0-*77 

0.0812 

0.0844 

1.’ 

0.0107 

0.0200 

0.0284 

0.0358 

0.0424 

0.0483 

O.OS37 

0.05S5 

0.0629 

0.0669 

0.0706 

M 

0.0094 

0.0178 

0.0253 

00320 

0.03X1 

0.0435 

0.01X5 

0.0530 

0.0572 

14 

0.0084 

0.0159 

0.0227 

0.07K9 

0.03 14 

0.0395 

0.0141 

It 

0.0076 

0.0144 

0.0706 

0.0262 

0.0314 

It. 

0.P06K 

0.0131 

0.01X7 

r 

0.0062 

( 

>5 

M 

37 

38 

39 

40 

41 

42 

43 

44 

45 

46 

47 

48 

49 

50 

1 

0.4IW6 

0.4068 

0.4040 

0.401$ 

0.3989 

0.3964 

0.3940 

0.3917 

0.3894 

0.7872 

0.3850 

0.3X30 

0.3808 

0.3789 

0.3770 

0  3751 

* 

0.2X34 

0  281.7 

0.2794 

0.2774 

0.2755 

0.2737 

0.2719 

0.2701 

0.2684 

0.2667 

0.2651 

0.2635 

0.2620 

0.26.04 

0.25X9 

0.2574 

3 

0.2477 

0.241$ 

0.2403 

0.2391 

0.2380 

0  2368 

0.2357 

0.2345 

0.2334 

0.2323 

0.2313 

0.7302 

0.2291 

0.7281 

0  2771 

0.7260 

4 

0.2127 

0.2121 

0.2116 

0.2110 

0.2104 

0.2098 

0.2091 

0.2085 

0.2078 

0.2072 

0.2065 

0.20S8 

0.7052 

0.20  It 

0  2078 

o  7i>'2 

5 

O.IKX.7 

0.1883 

0.1883 

0.1X81 

0.18X0 

0.1878 

0.1876 

0.1  S74 

0.1871 

0.1668 

0.1865 

0.1862 

0.1859 

0.1855 

0.1851 

0.1X47 

6 

0.1673 

0.1678 

0.1683 

0.1656 

0.1 6S9 

0.1691 

0.1693 

0.1694 

0.1695 

0.1695 

0.169$ 

0.1695 

0.1695 

0.1 6*1.7 

0.1692 

0  1691 

7 

0.1487 

0.1496 

0.1505 

0.1513 

0.1520 

0.1526 

0.1531 

0.1575 

0.15.79 

0.1542 

0.154$ 

0.1548 

0.1550 

0.1551 

0.1553 

0  1  "1 

8 

0.1317 

0.1331 

0.1344 

0.1356 

0.1366 

0.1376 

0.1384 

0.1392 

0.1398 

0.I40S 

0.1410 

0.1415 

0.1420 

0.1423 

(1.1427 

0.  Il>0 

9 

0.1 160 

0.1179 

0.1196 

0.1211 

41.1225 

0.1237 

0.1249 

0.1259 

0.1269 

0.1278 

0.1286 

0.1293 

0.1300 

0.1306 

0.1312 

0  1317 

III 

0  1013 

0.1036 

0.1056 

0.107$ 

'  0.‘l092 

0.1108 

0.1123 

0.1136 

0.1149 

0.1160 

0.1170 

0.1180 

0.1IS9 

0.1197 

0.1205 

0  1717 

II 

0.0873 

0.0400 

0.0924 

0.0947 

0.0967 

0.0986 

0.1004 

0.1020 

0.1035 

0.1049 

0.1062 

0.1073 

0.1085 

0.1095 

0.1105 

Hill' 

12 

O.H739 

0.0770 

0.0798 

0.0824 

0.0818 

0.0X70 

0.0891 

0.0909 

0.0927 

0.0943 

0.0959 

0.0972 

0.0986 

0.0/98 

0.1  om 

0  10. 'll 

11 

0.061(1 

0  0645 

0.0677 

0.0706 

0.0733 

0.0759 

0.0782 

0.0X04 

O.OS24 

0.0842 

0.0860 

0.0876 

0.0892 

0.0*106 

(1,09 1*1 

0  09 12 

14 

IUMS4 

0.0523 

0.0559 

0.0592 

0.0622 

0.0051 

0.0677 

0.0701 

0.0724 

0.074$ 

0.076$ 

0.07X3 

0.0801 

0.0X17 

O.OS32 

II  ll.vlo 

15 

O.lHol 

0.0404 

0.0444 

0.0481 

0.0515 

0.0546 

0.057$ 

0.0602 

0.0678 

0.0651 

0.0673 

0.06.94 

0.0713 

0.0731 

(1.0748 

llll'c.l 

li» 

011239 

0.02X7 

0.0331 

0.0372 

0.0409 

0.0444 

0.0476 

0.0506 

0.0534 

0.0560 

0.05S4 

0.0607 

0.062S 

0.06  IS 

0.0667 

0  061.5 

17 

0.0119 

0  0172 

0.0220 

00264 

0.0305 

0.0343 

0.0379 

0.0111 

0.041? 

0.0471 

0.0497 

0.0522 

0.0546 

o.(»t.s 

00588 

IIIW4IS 

IK 

0.0057 

0.01 10 

0.0158 

0.020.7 

0.0244 

0.0283 

0.0318 

0.0.752 

0.0J53 

0.0117. 

0.0439 

0.0465 

0.04X9 

0  0511 

11.115,1  ? 

14 

0.0053 

0.0101 

0.0146 

0.01X8 

0.0727 

0.0263 

(1.0296 

0.0328 

(1.0357 

0.0385 

0.04 11 

0  0116 

I1HI59 

?ll 

0.0049 

0.0094 

0.0136 

0.0175 

0.0211 

0.024$ 

00777 

0.0307 

0.0335 

0.0 361 

0.1*150 

21 

0.0015 

0.0OS7 

0.0126 

0.016.7 

0  0197 

0.0229 

0.0759 

0  02X8 

0  0714 

72 

0.0012 

0.0081 

.  0.0118 

0.0153 

0.018$ 

0  0215 

0  074  1 

23 

0.00.79 

0.0076 

0.01 1 1 

O.OI43 

0.0174 

24 

0.0037 

0  0071 

0.0104 

25  0.0035 


(From G. J. Hahn and S. S. Shapiro, Statistical Models in Engineering,
John Wiley & Sons, New York, 1967, pp. 330-331.)
TABLE B-8

Percentage Points of the W Statistic


n       1%       2%       5%       10%      50%

3       0.753    0.756    0.767    0.789    0.959
4       0.687    0.707    0.748    0.792    0.935
5       0.686    0.715    0.762    0.806    0.927
6       0.713    0.743    0.788    0.826    0.927
7       0.730    0.760    0.803    0.838    0.928
8       0.749    0.778    0.818    0.851    0.932
9       0.764    0.791    0.829    0.859    0.935
10      0.781    0.806    0.842    0.869    0.938
11      0.792    0.817    0.850    0.876    0.940
12      0.805    0.828    0.859    0.883    0.943
13      0.814    0.837    0.866    0.889    0.945
14      0.825    0.846    0.874    0.895    0.947
15      0.835    0.855    0.881    0.901    0.950
16      0.844    0.863    0.887    0.906    0.952
17      0.851    0.869    0.892    0.910    0.954
18      0.858    0.874    0.897    0.914    0.956
19      0.863    0.879    0.901    0.917    0.957
20      0.868    0.884    0.905    0.920    0.959
21      0.873    0.888    0.908    0.923    0.960
22      0.878    0.892    0.911    0.926    0.961
23      0.881    0.895    0.914    0.928    0.962
24      0.884    0.896    0.916    0.930    0.963
25      0.888    0.901    0.918    0.931    0.964
26      0.891    0.904    0.920    0.933    0.965
27      0.894    0.906    0.923    0.935    0.965
28      0.896    0.908    0.924    0.936    0.966
29      0.898    0.910    0.926    0.937    0.966
30      0.900    0.912    0.927    0.939    0.967
31      0.902    0.914    0.929    0.940    0.967
32      0.904    0.915    0.930    0.941    0.968
33      0.906    0.917    0.931    0.942    0.968
34      0.908    0.919    0.933    0.943    0.969
35      0.910    0.920    0.934    0.944    0.969
36      0.912    0.922    0.935    0.945    0.970
37      0.914    0.924    0.936    0.946    0.970
38      0.916    0.925    0.938    0.947    0.971
39      0.917    0.927    0.939    0.948    0.971
40      0.919    0.928    0.940    0.949    0.972
41      0.920    0.929    0.941    0.950    0.972
42      0.922    0.930    0.942    0.951    0.972
43      0.923    0.932    0.943    0.951    0.973
44      0.924    0.933    0.944    0.952    0.973
45      0.926    0.934    0.945    0.953    0.973
46      0.927    0.935    0.945    0.953    0.974
47      0.928    0.936    0.946    0.954    0.974
48      0.929    0.937    0.947    0.954    0.974
49      0.929    0.937    0.947    0.955    0.974
50      0.930    0.938    0.947    0.955    0.974

(From G. J. Hahn and S. S. Shapiro, Statistical Models in
Engineering, John Wiley & Sons, New York, 1967, p. 332.)
TABLE B-9

Percentage Points For the WE
Statistic


            95% Range             90% Range
n       Lower      Upper      Lower      Upper
        Point      Point      Point      Point

7       0.062      0.404      0.071      0.358
8       0.054      0.342      0.062      0.301
9       0.050      0.301      0.058      0.261
10      0.049      0.261      0.056      0.231
11      0.046      0.234      0.052      0.208
12      0.044      0.215      0.050      0.191
13      0.040      0.195      0.046      0.173
14      0.038      0.178      0.043      0.159
15      0.036      0.163      0.040      0.145
16      0.034      0.150      0.038      0.134
17      0.030      0.135      0.034      0.120
18      0.028      0.123      0.031      0.109
19      0.026      0.114      0.029      0.102
20      0.025      0.106      0.028      0.095
21      0.024      0.101      0.027      0.091
22      0.023      0.094      0.026      0.084
23      0.022      0.087      0.025      0.078
24      0.021      0.082      0.024      0.074
25      0.021      0.078      0.023      0.070
26      0.020      0.073      0.022      0.066
27      0.020      0.070      0.022      0.063
28      0.019      0.067      0.021      0.061
29      0.019      0.064      0.021      0.058
30      0.018      0.060      0.020      0.054
31      0.017      0.057      0.019      0.052
32      0.017      0.055      0.019      0.050
33      0.017      0.053      0.018      0.048
34      0.017      0.051      0.018      0.047
35      0.016      0.049      0.018      0.045

(From G. J. Hahn and S. S. Shapiro, Statistical Models in
Engineering, John Wiley & Sons, New York, 1967, p. 335.)


TABLE B-10


Percentage Points For the WE_0
Statistic


            95% Range             90% Range
n       Lower      Upper      Lower      Upper
        Point      Point      Point      Point

7       0.025      0.260      0.033      0.225
8       0.025      0.230      0.032      0.200
9       0.025      0.205      0.031      0.177
10      0.025      0.184      0.030      0.159
11      0.025      0.166      0.030      0.145
12      0.025      0.153      0.029      0.134
13      0.025      0.140      0.028      0.124
14      0.024      0.128      0.027      0.115
15      0.024      0.119      0.026      0.106
16      0.023      0.113      0.025      0.098
17      0.023      0.107      0.024      0.093
18      0.022      0.101      0.024      0.087
19      0.022      0.096      0.023      0.083
20      0.021      0.090      0.023      0.077
21      0.020      0.085      0.022      0.074
22      0.020      0.080      0.022      0.069
23      0.019      0.075      0.021      0.065
24      0.019      0.069      0.021      0.062
25      0.018      0.065      0.020      0.058
26      0.018      0.062      0.020      0.056
27      0.017      0.058      0.020      0.054
28      0.017      0.056      0.019      0.052
29      0.016      0.054      0.019      0.050
30      0.016      0.053      0.019      0.048
31      0.016      0.051      0.018      0.047
32      0.015      0.050      0.018      0.045
33      0.015      0.048      0.018      0.044
34      0.014      0.046      0.017      0.043
35      0.014      0.045      0.017      0.041

(From G. J. Hahn and S. S. Shapiro, Statistical Models in
Engineering, John Wiley & Sons, New York, 1967, p. 334.)
1. Bain, L.J., and C.E. Antle, "Estimation of Parameters in
the Weibull Distribution", Technometrics 9(4):621-627 (1967).

A new method of estimation is used to obtain two simple
estimators of the parameters in a Weibull distribution. These
estimators are similar to the estimators given by Gumbel,
Miller and Freund, and Menon. Monte Carlo methods were
used to determine the variances and biases of the estimators
for various sample sizes. Comparisons of the estimators
can be made and unbiasing factors calculated in some cases.

2. Bhattacharya, P.K., "Efficient Estimation of a Shift Parameter
From Grouped Data", Ann. Math. Statist. 38:1770-1787 (1967).

This paper considers two populations having frequency functions
f(x) and f(x-θ), where the common form f and the shift parameter
θ are unknown. A method of estimating θ when one sample
is reduced to a frequency distribution over a given set of
class-intervals is suggested by the likelihood principle, and the
asymptotic efficiency of this estimator relative to the appropriate
maximum likelihood estimator based on the complete data is
found to be the ratio of the Fisher information in a grouped
observation to the Fisher information in an ungrouped observation.

3. Birnbaum, Z.W., Probability and Mathematical Statistics,
Harper & Brothers, New York (1962).

General theory of tests of statistical hypotheses is presented
along with a detailed discussion of the Chi-squared distribution
and test. Distribution-free tests are also discussed, including
the Kolmogorov test and Smirnov test. Also included are the
likelihood function and likelihood ratio statistics.

4. Brunk, H.D., Mathematical Statistics, Blaisdell Publishing
Co., Waltham, Massachusetts (1965).

Basic theory of testing hypotheses is presented including a
discussion of testing a simple hypothesis against a simple
alternative, choice of null hypothesis, the power function, most
powerful tests and consistent tests. Specific tests described
are the Chi-squared test, Kolmogorov-Smirnov test for goodness
of fit, t-test, F-test, runs test, median test, and likelihood
ratio test.

5. Choi, S.C., and R. Wette, "Maximum Likelihood Estimation
of the Parameters of the Gamma Distribution and Their Bias",
Technometrics 11(4):683-690 (1969).

The maximum likelihood method is recommended for estimating
the parameters of a gamma distribution. Numerical techniques
for carrying out the calculation are examined. A convenient
table is obtained to facilitate the estimation of parameters.

The bias of the estimates is investigated by Monte Carlo; the
indication is that the bias of both parameter estimates
produced by the maximum likelihood method is positive.

6. Cornell, R.G., and J.A. Speckman, "Estimation for a Simple
Exponential Model", Biometrics 23:717-737 (1967).

Graphical, maximum likelihood, least squares, weighted least
squares, partial totals, moment, finite differences, Fisher,
and Spearman estimation procedures are presented for estimating
the parameter λ in the exponential model with expectations
given by $1 - e^{-\lambda T}$ for different values of T. The estimators
are described, referenced, illustrated, and compared. Tables
are cited which make several of the estimation procedures
easier computationally. Included in the comparison of the
estimators is a review of some Monte Carlo computations.

The method of maximum likelihood, which can be used for
any spacing of T-values, has very desirable large sample
properties. The simple method of partial totals is a possible
alternative for small samples of equally spaced T-values, while the
Fisher and Spearman methods are suggested alternatives for
T-values whose logarithms are equally spaced.

7. Cramér, H., Mathematical Methods of Statistics, Princeton
University Press, Princeton (1945).

Chapter 30 of this book describes "goodness of fit" statistical
tests. The two tests described in detail are the Chi-squared
test and the Cramér-von Mises test. However, statistics for the
Cramér-von Mises test and examples are not presented.
8. Dubey, S.D., "On Some Permissible Estimators of the Location
Parameter of the Weibull and Certain Other Distributions",
Technometrics 9(2):293-307 (1967).

An estimator for the location parameter of the Weibull
distribution is proposed which is independent of its shape and scale
parameters. Several properties of this estimator are
established which suggest a proper choice of three ordered sample
observations ensuring a permissible estimate of the location
parameter. This result is valid for every distribution which
has the location parameter acting as the origin or threshold
parameter. Asymptotic properties of such an estimator of
the location parameter of the Weibull distribution are discussed.
Finally, the paper contains a brief discussion of a percentile
estimator of the location parameter of the Weibull distribution
and includes some numerical illustrations.

9.  Elandt,  R.C.,  "The  Folded  Normal  Distribution:  Two  Methods 
of  Estimating  Parameters  From  Moments",  Technometrics 
3(4):551-562  (1961). 

The general formula for the r-th moment of the folded normal
distribution is obtained, and formulae for the first four
noncentral and central moments are calculated explicitly. Two
methods,  one  using  first  and  second  moments  of  the  sample 
and  the  other  using  second  and  fourth  moments,  of  estimating 
the  parameters  of  the  parent  distribution  are  presented  and 
their  standard  errors  calculated.  The  accuracy  of  both  methods 
is  discussed. 

10.  Elderton,  W.P.,  Frequency  Curves  and  Correlation,  4th  Ed., 
Cambridge  University  Press,  Cambridge,  (1953). 

A  thorough  covering  of  the  Pearson  system.  Describes  each 
type  of  distribution  and  gives  relevant  formulae  for  the  type 
of  curve. 

11. El-Sayyad, G.M., "Information and Sampling from the
Exponential Distribution", Technometrics 11(1):41-45 (1969).

Methods  of  sampling  an  exponential  population  in  order  to  obtain 
a  prescribed  accuracy  in  the  determination  of  the  unknown 
parameter  are  discussed.  The  concept  of  information  due  to 
Shannon  is  used  and  it  leads  to  well-known  schemes. 
12. Gnanadesikan, R., R.S. Pinkham, and L.P. Hughes, "Maximum
Likelihood Estimation of the Parameters of the Beta
Distribution from Smallest Order Statistics", Technometrics
9(4):607-620 (1967).

Numerical methods, useful with high-speed computers, are
described for obtaining the maximum likelihood estimates of
the two parameters of a beta distribution using the smallest
M observations, $0 < u_1 < u_2 < \dots < u_M$, in a random sample
of size K ($\ge M$). The maximum likelihood estimates are
functions only of the ratio $R = M/K$, the Mth ordered observation
$u_M$, and the two statistics
$G_1 = \left[\prod_{i=1}^{M} u_i\right]^{1/M}$ and
$G_2 = \left[\prod_{i=1}^{M} (1-u_i)\right]^{1/M}$.
For the case of the complete sample (R = 1),
however, the estimates are functions only of $G_1$ and $G_2$, and
hence, for this case, explicit tables of the estimates are
provided.

Some  examples  are  given  of  the  use  of  the  procedures  described 
for  fitting  beta  distributions  to  sets  of  data. 

13. Govindarajulu, Z., "Certain General Properties of Unbiased
Estimates of Location and Scale Parameters Based on Ordered
Observations", SIAM J. App. Math. 16(3):533-551 (1968).

Some upper bounds are derived for the variances of least squares
estimators based on a subset of the ordered observations in
a random sample of (i) location, (ii) scale, and (iii) both
location and scale parameters of a distribution.

14. Gumbel, E.J., "Statistical Theory of Extreme Values and
Some Practical Applications", National Bureau of Standards,
Applied Math Series 33 (Feb. 1954).

15. Hahn, G.J., and S.S. Shapiro, Statistical Models in Engineering,
John Wiley and Sons, New York (1967).

Discusses many continuous and discrete distributions. Gives
functional form, discusses theoretical basis, and mentions
applications. In some cases describes parameter estimation
techniques. Discusses advantages of fitting data to empirical
distributions. Describes Johnson system and displays plot
of $\beta_1$, $\beta_2$ values. Fitting procedures for Johnson distributions
are outlined and examples are given. Describes Pearson
system of distributions and displays plot. Does not
attempt to describe Pearson fitting procedures.

Discusses general techniques of goodness of fit tests. Two
procedures are discussed: a series of tests developed by
Shapiro and Wilk, known as W tests (including the WE test),
and the Chi-squared goodness of fit test. The W tests are
used to evaluate the assumption of a normal or exponential
distribution for a set of data. The procedures for using these
techniques are presented in a detailed step-by-step manner.

16. Haight, F.A., "Index to Distributions of Mathematical Statistics",
J. Res. Natl. Bureau Stand. - B. Math. and Math. Phys. 65B(1):23-66
(1961).

A fairly complete index of references to results on statistical
distributions published before January 1958 is presented.

The material given for each distribution is a list of references
relating to: (a) functions and constants which characterize
the distribution, (b) derived distributions, (c) estimation,
(d) testing statistical hypotheses, and (e) miscellaneous.

The distributions covered are characterized as normal, type
III, binomial, discrete, distributions over (a,b), distributions
over (a,∞), distributions over (-∞,∞), miscellaneous univariate,
miscellaneous bivariate, and miscellaneous multivariate.

The number of entries varies from one or two for less
well-known distributions to several hundred for the normal
distribution.

17. Harter, H.L., "Maximum-Likelihood Estimation of the
Parameters of a Four-Parameter Generalized Gamma Population
From Complete and Censored Samples", Technometrics
9(1):159-165 (1967).

The four-parameter generalized gamma distribution includes
such distributions as the usual three-parameter gamma, the
Weibull, the exponential, and the half normal. For these
distributions this paper develops the maximum likelihood equations.
Iterative computer techniques are needed to solve these equations.
Some results of applying this to various distributions are
presented.
18. Harter, H.L., "A New Table of Percentage Points of the Pearson
Type III Distribution", Technometrics 11(1):177-187 (1969).

A table of percentage points for the type III Pearson distribution.

19. Hodges, J.L., Jr., and E.L. Lehmann, "A Compact Table
For Power of the t-Test", Ann. Math. Statist. 39(5) (1968).

The paper gives a one-page table for t-power which covers
any value of the (one-sided) significance level α in the range
from 0.005 to 0.1, any value of the second-type error probability
β in the range from 0.01 to 0.5, and any number of degrees
of freedom greater than 2. The table gives reasonably accurate
answers without iteration and using only linear interpolation.

Eight examples are provided which illustrate a variety of t-power
problems.

20. Hogg, R.V., and A.T. Craig, Introduction to Mathematical
Statistics, The Macmillan Company, New York (1965).

Includes  chapters  on  order  statistics,  sufficient  statistics, 
statistical  hypotheses  and  statistical  tests.  It  provides  the 
theoretical  basis  of  the  Chi-square  tests  and  Bayesian  tests. 

It  also  describes  Likelihood  Ratio  tests  and  the  sequential 
probability  ratio  test. 

21. Johnson, N.L., "Systems of Frequency Curves Generated by
Methods of Translation", Biometrika 36:149-176 (1949).

Introduces Johnson system of distributions. Reviews literature
on systems of distributions. Provides a theoretical background
to Johnson system. Compares Johnson and Pearson systems
for skewness and kurtosis values. Gives some numerical
examples.

22. Johnson, N.L., "Tables to Facilitate Fitting $S_U$ Frequency
Curves", Biometrika 52:547 (1965).

In fitting empirical data to a distribution from the Johnson
family, one usually adjusts the parameters of the Johnson
distribution to match the first four moments of the original
data. However, given the first four moments it is not a trivial
problem to calculate the correct Johnson parameters. This
paper provides tables from which the Johnson parameters can
be obtained.

23. Johnson, N.L., and S. Kotz, Distributions in Statistics:
Discrete Distributions, Houghton-Mifflin Co., Boston (1969).

Thorough  covering  of  all  known  discrete  distributions.  Gives 
functional  form,  moments,  and  other  information  and  discusses 
the  estimation  of  parameters  for  each  distribution. 

24. Johnson, N.L., and S. Kotz, Distributions in Statistics:
Continuous Univariate Distributions, Vol. 1 and 2, Houghton-Mifflin
Co., Boston (1970).

Thorough  covering  of  all  known  continuous  distributions  (except 
empirical  families).  Gives  functional  form,  moments,  and 
other  information  and  discusses  the  estimation  of  parameters 
for  each  distribution. 

25. Johnson, N.L., E. Nixon, D.E. Amos, and E.S. Pearson,
"Table of Percentage Points of Pearson Curves, for Given
$\sqrt{\beta_1}$ and $\beta_2$, Expressed in Standard Measure", Biometrika
50:459-498 (1963).

For  the  general  Pearson  system  of  distributions,  this  paper 
gives  tables  of  percentiles  (or  solutions  of  the  inverse  equation) 
as  a  function  of  skewness  and  kurtosis. 

26.  Kagan,  A.  M. ,  "Estimation  Theory  for  Families  with  Location 
and  Scale  Parameters  and  For  Exponential  Families",  Proc. 
Steklov.  Inst.  Math.  104:19-87  (1968). 

This  theoretical  paper  investigates  families  of  distributions 
and  estimators.  The  conditions  for  admissible  estimators  are 
discussed. 

27. Kendall, M.G., and A.S. Stuart, The Advanced Theory of
Statistics, Vol. 1, Distribution Theory, Charles Griffin &
Co. (1958).


28. Kodlin, D., "A New Response Time Distribution", Biometrics
23:227-239 (1967).

A skewed, two-parameter distribution is described which has
been found useful in the analysis of human survival time data.
The density has the form
$f(t) = (c + kt)\,e^{-(ct + \frac{1}{2}kt^2)}$. This form
is integrable and has manageable first and second moments.
Since the distribution has non-zero density at the origin, it
may be of value in connection with those types of responses
which take place even before observation begins. Description
of a maximum likelihood technique of estimating the parameters
is followed by discussion of damage models that incorporate
the distribution.
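
The remark that the density is integrable can be made concrete with a short sketch. Integrating $f(t) = (c + kt)\,e^{-(ct + kt^2/2)}$ from 0 to t gives the closed-form distribution function $F(t) = 1 - e^{-(ct + kt^2/2)}$, which can be inverted for sampling. The function names below are hypothetical, and the code assumes c >= 0 and k > 0; it is an illustration, not the estimation procedure of the paper.

```python
import math
import random


def kodlin_pdf(t, c, k):
    """Density (c + k t) exp(-(c t + k t^2 / 2)) for t >= 0."""
    return (c + k * t) * math.exp(-(c * t + 0.5 * k * t * t))


def kodlin_cdf(t, c, k):
    """Closed-form integral of the density from 0 to t."""
    return 1.0 - math.exp(-(c * t + 0.5 * k * t * t))


def kodlin_inverse_cdf(u, c, k):
    """Solve c t + k t^2 / 2 = -ln(1 - u) for the non-negative root (needs k > 0)."""
    w = -math.log(1.0 - u)
    return (-c + math.sqrt(c * c + 2.0 * k * w)) / k


def kodlin_sample(c, k, rng=random):
    """Draw one variate by inverting the distribution function."""
    return kodlin_inverse_cdf(rng.random(), c, k)
```

Note that `kodlin_pdf(0.0, c, k)` equals c, which is the non-zero density at the origin mentioned in the annotation.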

29. Langton, N.H., "Statistical Distribution", Brit. Chem. Engr.
8:478-484 (1963).

This paper is an elementary article which gives the basic
concepts and formulae characterizing probability distributions
and sampling. It discusses the binomial, Poisson, and normal
distributions and the fitting of empirical data to these
distributions using the method of moments.

30. Malik, H.J., "Estimation of the Parameters of the Pareto
Distribution", Metrika 15:126-136 (1970).

In this paper, sufficient estimators for the parameters a and
v of the Pareto distribution are obtained. It is shown that
$Y_1 = \min(X_1, \dots, X_N)$ is sufficient for a when v is known,
the sample geometric mean g is sufficient for v when a is
known, and $\left(Y_1, \sum_{i=1}^{N} \ln \frac{Y_i}{Y_1}\right)$ is a joint
set of sufficient statistics
for (a,v) when both are unknown. The exact distribution of
the maximum likelihood estimator is derived.
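
A minimal sketch of these statistics follows, assuming the common Pareto parametrization with density $v\,a^v / x^{v+1}$ for $x \ge a$ (the annotation does not state the parametrization). Under that assumption the maximum likelihood estimates are the sample minimum for a and N divided by the sum of log ratios for v. The function names are hypothetical.

```python
import math


def pareto_sufficient_stats(xs):
    """Return (Y1, sum of ln(x_i / Y1)), where Y1 is the sample minimum."""
    y1 = min(xs)
    log_sum = sum(math.log(x / y1) for x in xs)
    return y1, log_sum


def pareto_mle(xs):
    """Maximum likelihood estimates (a_hat, v_hat) under the assumed density."""
    y1, log_sum = pareto_sufficient_stats(xs)
    if log_sum == 0.0:
        raise ValueError("all observations are equal; v_hat is undefined")
    return y1, len(xs) / log_sum
```

Both estimates depend on the data only through the pair returned by `pareto_sufficient_stats`, which is what joint sufficiency means in practice.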

31. Mandel, J., "A Method for Fitting Empirical Surfaces to Physical
or Chemical Data", Technometrics 11(3):411-429 (1969).

A method, largely graphical, for fitting a distribution to
bivariate data is presented. An example is given. The method
does not require prior assumptions as to the form of the
distribution to be fit. However, it may not have general
applicability and needs further investigation.

32.  Marshall,  A.W. ,  and  I.  Olkin,  "A  Multivariate  Exponential 
Distribution",  J.  Amer.  Stat.  Assoc.  62:30-44  (1967). 

A  number  of  multivariate  exponential  distributions  are  known, 
but  they  have  not  been  obtained  by  methods  that  shed  light  on 
their  applicability.  This  paper  presents  some  meaningful 
derivations of a multivariate exponential distribution that serve
to indicate conditions under which the distribution is appropriate.
Two of these derivations are based on "shock models", and one
is based on the requirement that residual life is independent
of age. It is significant that the derivations all lead to the same
distribution.

For this distribution, the moment generating function is obtained,
comparison is made with the case of independence, the
distribution of the minimum is discussed, and various other
properties are investigated. A multivariate Weibull distribution is
obtained through a change of variables.

33. Massey, Frank J., Jr., "The Kolmogorov-Smirnov Test
for Goodness of Fit", J. Am. Stat. Assoc. 46 (1951).

The Kolmogorov-Smirnov test, which is based on the maximum
difference between an empirical and a hypothetical cumulative
distribution, is discussed. Percentage points are tabulated,
and a lower bound to the power function is charted. Confidence
limits for a cumulative distribution are described. Examples
are given. Indications that the test is superior to the
Chi-square test are cited.
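
The "maximum difference" can be sketched in a few lines. Because the empirical distribution function jumps at each observation, the supremum has to be checked on both sides of every jump. The function name `ks_statistic` is hypothetical, and the code computes only the statistic; critical values would still come from tabulated percentage points.

```python
def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF of sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Empirical CDF is i/n just below x and (i + 1)/n at x.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d
```

For the sample 0.1, 0.4, 0.7 tested against the uniform distribution on (0, 1), the largest gap occurs at the last point, where the empirical value 1 exceeds the hypothetical value 0.7.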

34. Mann, Nancy R., "Point and Interval Estimation Procedures
for the Two-Parameter Weibull and Extreme-Value Distributions",
Technometrics 10(2):231-256 (1968).

Point estimators of parameters of the first asymptotic
distribution of smallest (extreme) values, the extreme-value
distribution, are surveyed and compared. Since the logarithms of
variates having the two-parameter Weibull distribution are
variates from the extreme-value distribution, the investigation
is applicable to the estimation of Weibull parameters. Those
estimators investigated are maximum-likelihood and moment
estimators, inefficient estimators based on only a few ordered
observations, and various linear estimation methods. A
combination of Monte Carlo approximations and exact small-sample
and asymptotic results has been used to compare the expected
loss (with loss equal to squared error) of these various point
estimators. Interval estimation procedures are also discussed.
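
The logarithm relationship can be checked numerically. If X is Weibull with scale b and shape c, then ln X follows the extreme-value distribution of smallest values with location ln b and spread 1/c. The sketch below uses hypothetical function names and the usual textbook forms of both distribution functions, which are assumptions not spelled out in the annotation.

```python
import math


def weibull_cdf(x, scale, shape):
    """CDF of the two-parameter Weibull distribution for x > 0."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def smallest_extreme_value_cdf(y, location, spread):
    """CDF of the extreme-value distribution of smallest values."""
    return 1.0 - math.exp(-math.exp((y - location) / spread))


def log_weibull_cdf(y, scale, shape):
    """CDF of ln X, expressed through the extreme-value form."""
    return smallest_extreme_value_cdf(y, math.log(scale), 1.0 / shape)
```

Evaluating `weibull_cdf(x, b, c)` and `log_weibull_cdf(math.log(x), b, c)` at the same x gives the same probability, which is why estimators developed for one family carry over to the other.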

35. McGrath, E.J., Fundamentals for Operations Research,
West Coast University, 1970, Chapter 3.

Discussion of probability distributions and estimators for most
basic distributions. Weibull - describes distribution and typical
curves and discusses estimators for parameters. Johnson -
defines distribution, displays typical curve shapes, and gives
skewness-kurtosis diagram for family. Extensive discussion,
with examples, of estimation of parameters. Pearson - defines
distribution types and gives skewness-kurtosis plot for family.
Discussion of $\chi^2$-test for evaluation of fits.

36. Meier, F.A., "Non-Normal Statistical Distributions and Their
Use in Industrial Engineering", Amer. Inst. of Indust. Eng.,
Tech. Papers, 20 Inst. Conf. and Conv. 71-83 (1969).

Both  the  gamma  and  Weibull  distributions  are  described  with 
comments  on  calculational  methods  and  approximations.  A 
thorough  review  of  methods  for  estimating  parameters  is  given. 

37. Mengel, P.R., "Fragility Curve Preparation Methods",
unpublished memo, 1970.

Presents  a  methodology  for  fitting  data  from  failure  levels 
to  a  lognormal  distribution.  Theoretical  reasons  underlying 
the  use  of  the  lognormal  for  this  case  are  discussed. 

38. Menon, M.W., "Estimation of the Shape and Scale Parameters
of the Weibull Distribution", Technometrics 5(2):175-182 (1963).

Estimates $\hat{c}$ and $\hat{b}$ are proposed for the shape parameter c
and the scale parameter b of the Weibull distribution on the
assumption that the location parameter is known. First an
estimate $\hat{d}$ of d = 1/c is found; then $\hat{c}$ is obtained as
$1/\hat{d}$. When b is unknown, $\hat{d}$ is a consistent and non-negative
estimate of d, with a bias which tends to vanish as the sample size
increases and with an asymptotic efficiency of about 55%. When b
is known, the estimate of d is unbiased, non-negative, and
consistent, and its efficiency is approximately 84%. An estimate
$\widehat{\ln b}$ of ln b is found with an asymptotic efficiency of 95%.
It is proposed that $\exp(\widehat{\ln b})$ be used to estimate b.

39. Neave, H.R., and C.W.J. Granger, "A Monte Carlo Study
Comparing Various Two-Sample Tests for Differences in Mean",
Technometrics 10(3) (1968).

A  study  was  conducted  on  eight  tests  for  differences  in  means 
under  a  variety  of  simulated  experimental  situations.  Estimates 
were  made  of  the  power  of  the  tests  and  measures  made  of 
the  extent  to  which  they  gave  similar  results.  In  particular 
the  performance  of  a  new  quick  test  developed  by  Neave  was 
studied. 

40. Pearson, K., "Mathematical Contributions to the Theory of
Evolution - Supplement to a Memoir on Skew Variation", Phil.
Trans. Roy. Soc. London 197:443-459 (1901).

One  of  the  classic  papers  introducing  some  of  the  Pearson 
system  distributions  and  giving  some  examples. 

41. Pearson, K., "Mathematical Contributions to the Theory of
Evolution - Second Supplement to a Memoir on Skew Variation",
Phil. Trans. Roy. Soc. London A216:429-457 (1916).

Classical  paper  setting  forth  the  properties  of  the  Pearson 
system  and  the  distributions  in  it. 

42. Pearson, E.S., and H.O. Hartley (eds), Biometrika Tables
for Statisticians, Vol. I, sections 23-24, Cambridge Univ.
Press (1958).

The  basic  functional  forms  and  some  properties  are  given  for 
each  distribution  in  the  Pearson  system.  Some  applications 
showing  the  fitting  to  empirical  data  are  discussed. 

43. Pickands, J., III, "Efficient Estimation of a Probability Density
Function", Ann. Math. Statist. 40(3):854-864 (1969).

Some theoretical results in using the "kernel method" to
estimate a probability density function are derived.

44. Plait, Alan, "The Weibull Distribution - with Tables",
Industrial Quality Control 19(5):17-26 (1962).

Describes  Weibull  distribution  and  gives  extensive  tables  to 
aid  in  curve  fitting. 

45. Press, S.J., "The T-Ratio Distribution", J. Amer. Stat. Ass.
64:242-252 (1969).

The distribution of the ratio of correlated Student T-variates
is of interest in problems in econometrics and ranking and
selection. The density of this ratio is derived and computer
graphs of the density are given in terms of standardized variates.
Fractiles are given for selected parameter values. It is shown
that the distribution possesses no moments.

46.  Schwartz,  S.C. ,  "Estimation  of  a  Probability  Density  by  an 
Orthogonal  Series",  Ann.  Math.  Statist.  38:1261-1265  (1967). 

The estimation of an unknown probability density function from
a realized sequence of random numbers is considered. An
approximation in terms of a sum of Hermite polynomials is
made and equations for the coefficients are derived.
Convergence to the correct density function is proven and convergence
rates are calculated. Comparison to the kernel method is
made.

47. Shapiro, S.S., and M.B. Wilk, "An Analysis of Variance Test
For Normality (Complete Samples)", Biometrika 52 (1965).

A new statistical procedure (W Test) for testing a complete
sample for normality is presented. The test statistic is
obtained by dividing the square of an appropriate linear
combination of the sample order statistics by the usual symmetric
estimate of variance. Presented are derivation, properties,
and applications of the W test and comparison with other tests.

48. Suzuki, Giitiro, "On Exact Probabilities of Some Generalized
Kolmogorov's D-Statistics", Institute on Statistical Mathematics,
Annals, Tokyo, 19 (1967).

This paper gives a unified computational method for exact
probabilities of the most generalized form of the D-statistic
proposed by Kolmogorov for non-parametric tests of fit.

First, a historical survey of the subject is given and then
goodness-of-fit D-tests are stated (based on some general
bounds) by constructing general acceptance and confidence
regions, sizes of which are calculated in a distribution-free
way. The method is also applied to calculation of the exact
power of tests for a certain continuous alternative. A
computational method for the functional a^(. . . ) is presented.

49. Takahasi, K., and K. Wakimoto, "On Unbiased Estimates
of the Population Mean Based on the Sample Stratified by Means
of Ordering", Ann. Inst. on Stat. Math., Tokyo, 20:1-31 (1968).

In many experimental situations, it is costly and time-consuming
to make accurate measurements while at the same time
judgments as to relative order of size can be made easily. This
paper describes techniques for ordering subgroups of a large
sample, then picking a smaller sample, using the stratification
induced by the ordering. Accurate measurements are made
only on the smaller sample. An unbiased estimate of the
population mean can be generated from this small sample with much
less variance than would be obtained in estimating from a sample
of similar size, but randomly chosen. This is basically an
example of stratified sampling, but as applied prior to
experimental measuring rather than to choices made in simulation.
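
The procedure can be illustrated with a small simulation. The sketch below assumes one common form of the scheme: draw m subgroups of size m, order each subgroup, and measure only the item of rank 1 from the first subgroup, rank 2 from the second, and so on. Sorting by the true values stands in for judgment ordering, which is an idealization, and all function names are hypothetical.

```python
import random
import statistics


def ranked_set_mean(draw, m, rng):
    """Mean of m measured items, one chosen by rank from each of m ordered subgroups."""
    measured = []
    for rank in range(m):
        # Sorting by true value models perfect judgment ordering (an assumption).
        subgroup = sorted(draw(rng) for _ in range(m))
        measured.append(subgroup[rank])
    return statistics.fmean(measured)


def simple_random_mean(draw, m, rng):
    """Mean of m items measured without any prior ordering."""
    return statistics.fmean([draw(rng) for _ in range(m)])


def standard_normal(rng):
    """Hypothetical population used only for the demonstration."""
    return rng.gauss(0.0, 1.0)
```

Repeating both estimators many times with the same m and comparing the spread of the resulting means shows the variance reduction described above; both estimators use m accurate measurements per repetition.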

50. Tarter, M.E., R.L. Holcomb, and R.A. Kronmal, "After
the Histogram, What? A Description of New Computer Methods
for Estimating the Population Density", Proc. ACM 22nd Natl.
Conf. P-67:511-519 (1967).

The kernel method for estimating a probability density function
from a sequence of random observations is discussed. As
an alternative, a Fourier expansion is considered for an
estimate of the density. Restrictions on the function and the optimum
order of the expansion are derived. Computer implementation
of this algorithm is discussed and several applications are
displayed.
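
For orientation, a kernel estimate places a small bump at each observation and averages the bumps. The sketch below uses a Gaussian kernel with a fixed bandwidth; both choices are assumptions made for illustration and are not taken from the paper, and the function name is hypothetical.

```python
import math


def gaussian_kernel_density(sample, bandwidth):
    """Return a function x -> kernel density estimate built from sample."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(sample)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, centered at that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in sample
        )

    return density
```

A smaller bandwidth follows the data more closely at the cost of a rougher curve, which is the trade-off that any choice of smoothing parameter has to balance.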

51. Thoman, D.R., L.J. Bain, and C.E. Antle, "Inferences on
the Parameters of the Weibull Distribution", Technometrics
11(3):445-480 (1969).

The problems of estimation and testing hypotheses regarding
the parameters in the Weibull distribution are considered in
this paper. The following results are given:

1. Exact confidence intervals for the parameters
based on maximum likelihood estimators are
presented.

2. A table of unbiasing factors (depending upon
sample size) for the maximum likelihood
estimator of the shape parameter is given.

3. Tests of hypotheses regarding the parameters
and the power of the test regarding the shape
parameter are developed and presented.

4. Sample sizes at which large sample theory
may be useful are presented.

52. Thornber, H., "Finite Sample Monte Carlo Studies: An
Autoregressive Illustration", J. Amer. Stat. Assoc. 62:801-818
(1967).

In this paper the problem of choosing among point estimators
on the basis of their small sample properties is discussed
from the sampling point of view. The indeterminacy of most
Monte Carlo studies is analyzed and resolved within the
framework of statistical decision theory. A first order autoregressive
model is worked through in detail both for its own sake
and to illustrate how a complete Monte Carlo study might be
done.

53. Weibull, W., "A Statistical Distribution Function of Wide
Applicability", J. App. Mech. 18(3):293-297 (1951).

Introduces  the  Weibull  distribution  and  gives  several  examples 
of  fitting  to  it. 

54. Weiss, L., and J. Wolfowitz, "Maximum Probability
Estimators", Ann. Inst. Stat. Math. Tokyo 19:193-206 (1967).

A new class of estimators, called maximum probability
estimators, is suggested as an alternative to maximum likelihood
estimators.
55. White, J.S., "The Moments of Log-Weibull Order Statistics",
Technometrics 11:373-386 (1969).

Formulas for the moments of the order statistics of a general
distribution are derived. Then the log-Weibull distribution
is introduced and the moments of its order statistics are
calculated. An application showing how this can be applied to the
fitting of a Weibull distribution to empirical data is given.